Abstract. The purpose of this paper is to provide a detailed probabilistic analysis of the optimal 
control of nonlinear stochastic dynamical systems of the McKean-Vlasov type. Motivated by the recent 
interest in mean field games, we highlight the connection and the differences between the two sets of 
problems. We prove a new version of the stochastic maximum principle and give sufficient conditions 
for existence of an optimal control. We also provide examples for which our sufficient conditions for 
existence of an optimal solution are satisfied. Finally we show that our solution to the control problem 
provides approximate equilibria for large stochastic games with mean field interactions. 



1. Introduction 

The purpose of this paper is to provide a detailed probabilistic analysis of the optimal control of 
nonlinear stochastic dynamical systems of the McKean-Vlasov type. The present study is motivated 
in part by a recent surge of interest in mean field games. 

We prove a version of the stochastic Pontryagin maximum principle that is tailor-made to 
McKean-Vlasov dynamics and give sufficient conditions for existence of an optimal control. We also 
provide a class of examples for which our sufficient conditions for existence of an optimal solution 
are satisfied. Putting these conditions to work to solve an optimal control problem leads to a system 
of Forward Backward Stochastic Differential Equations (FBSDEs for short) where the marginal 
distributions of the solutions appear in the coefficients of the equations. We call these equations 
mean field FBSDEs, or FBSDEs of McKean-Vlasov type. To the best of our knowledge, these equations 
have not been studied before. A rather general existence result was recently proposed in [6], but one 
of the assumptions (boundedness of the coefficients with respect to the state variable) precludes 
applications to solvable models such as the Linear Quadratic (LQ for short) models. Here, we take 
advantage of the convexity of the underlying Hamiltonian and apply the so-called continuation method 
exposed in [14] in order to prove existence and uniqueness of the solution of the FBSDEs at hand, 
extending and refining the results of [6] to the models considered in this paper. The technical 
details are given in Section 5. 

Without the control $\alpha_t$ (see Eq. (1) right below), stochastic differential equations of 
McKean-Vlasov type are associated with special nonlinear Partial Differential Equations (PDEs) put on 
a rigorous mathematical footing by Henry McKean Jr in [11]. See also [12, 16, 10]. Existence and 
uniqueness results for these equations have been developed in order to provide effective equations 
for studying large systems, reducing the dimension and the complexity, at the cost of handling 
non-Markovian dynamics depending upon the statistical distribution of the solution. In the same 
spirit, we show in Section 6 that our solution of the optimal control of McKean-Vlasov stochastic 
differential equations provides strategies putting a large system of individual optimizers in an 
approximate equilibrium, the notion of equilibrium being defined appropriately. The proof is based on 
standard arguments from the theory of the propagation of chaos, see for example [16, 10]. The 
identification of approximate equilibria in feedback form requires strong regularity properties of 
the decoupling field of the FBSDE. They are proved in Section 5. 



2. Probabilistic Set-Up of McKean-Vlasov Equations 

In what follows, we assume that $W = (W_t)_{0 \le t \le T}$ is an m-dimensional standard Wiener 
process defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and 
$\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$ is its natural filtration, possibly augmented with an 
independent $\sigma$-algebra $\mathcal{F}_0$. For each random variable/vector or stochastic process 
$X$, we denote by $\mathbb{P}_X$ the law (alternatively called the distribution) of $X$. 

The stochastic dynamics of interest in this paper are given by a stochastic process 
$X = (X_t)_{0 \le t \le T}$ satisfying a nonlinear stochastic differential equation of the form 

(1)    $dX_t = b(t, X_t, \mathbb{P}_{X_t}, \alpha_t)\,dt + \sigma(t, X_t, \mathbb{P}_{X_t}, \alpha_t)\,dW_t, \qquad 0 \le t \le T,$ 

where the drift and diffusion coefficient of the state $X_t$ of the system are given by the pair of 
deterministic functions 
$(b, \sigma) : [0,T] \times \mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d) \times A \to \mathbb{R}^d \times \mathbb{R}^{d \times m}$ 
and $\alpha = (\alpha_t)_{0 \le t \le T}$ is a progressively measurable process with values in a 
measurable space $(A, \mathcal{A})$. Typically, $A$ will be an open subset of a Euclidean space 
$\mathbb{R}^k$ and $\mathcal{A}$ the $\sigma$-field induced by the Borel $\sigma$-field of this 
Euclidean space. 

Also, for each measurable space $(E, \mathcal{E})$, we use the notation $\mathcal{P}(E)$ for the space 
of probability measures on $(E, \mathcal{E})$, assuming that the $\sigma$-field $\mathcal{E}$ on which 
the measures are defined is understood. When $E$ is a metric or a normed space (most often 
$\mathbb{R}^d$), we denote by $\mathcal{P}_p(E)$ the subspace of $\mathcal{P}(E)$ of the probability 
measures of order $p$, namely those probability measures which integrate the $p$-th power of the 
distance to a fixed point (whose choice is irrelevant in the definition of $\mathcal{P}_p(E)$). The 
term nonlinear does not refer to the fact that the coefficients $b$ and $\sigma$ could be nonlinear 
functions of $X$ but instead to the fact that they depend not only on the value of the unknown process 
$X_t$ at time $t$, but also on its marginal distribution $\mathbb{P}_{X_t}$. We shall assume that the 
drift coefficient $b$ and the volatility $\sigma$ satisfy the following assumptions. 

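(A1) For each $x \in \mathbb{R}^d$, $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and $\alpha \in A$, the 
function $[0,T] \ni t \mapsto (b, \sigma)(t, x, \mu, \alpha) \in \mathbb{R}^d \times \mathbb{R}^{d \times m}$ 
is square integrable; 
(A2) $\exists c > 0$, $\forall t \in [0,T]$, $\forall \alpha \in A$, $\forall x, x' \in \mathbb{R}^d$, 
$\forall \mu, \mu' \in \mathcal{P}_2(\mathbb{R}^d)$, 

    $|b(t, x, \mu, \alpha) - b(t, x', \mu', \alpha)| + |\sigma(t, x, \mu, \alpha) - \sigma(t, x', \mu', \alpha)| \le c\,[\,|x - x'| + W_2(\mu, \mu')\,];$ 

where $W_2(\mu, \mu')$ denotes the 2-Wasserstein distance. For a general $p \ge 1$, the p-Wasserstein 
distance $W_p(\mu, \mu')$ is defined on $\mathcal{P}_p(E)$ by: 

    $W_p(\mu, \mu') = \inf\Big\{ \Big[ \int_{E \times E} |x - y|^p\, \pi(dx, dy) \Big]^{1/p} ;\ \pi \in \mathcal{P}_2(E \times E) \text{ with marginals } \mu \text{ and } \mu' \Big\}.$ 

Notice that if $X$ and $X'$ are random variables of order 2, then 
$W_2(\mathbb{P}_X, \mathbb{P}_{X'}) \le [\mathbb{E}|X - X'|^2]^{1/2}$. 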
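
The bound $W_2(\mathbb{P}_X, \mathbb{P}_{X'}) \le [\mathbb{E}|X - X'|^2]^{1/2}$ can be illustrated numerically. The following is a minimal sketch, not part of the original analysis, under two assumptions: the measures are uniform empirical measures on the real line with the same number of atoms, in which case the optimal coupling is known to pair the sorted atoms. The function names `w2_empirical_1d` and `l2_coupling_cost` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def w2_empirical_1d(x, y):
    """2-Wasserstein distance between two uniform empirical measures on R
    with the same number of atoms (optimal coupling = sorted pairing)."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("both samples must have the same number of atoms")
    return float(np.sqrt(np.mean((x - y) ** 2)))

def l2_coupling_cost(x, y):
    """[E|X - X'|^2]^{1/2} for the particular coupling pairing x[i] with y[i]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))

# Illustrative data: X' is a noisy, rescaled copy of X (an arbitrary choice).
rng = np.random.default_rng(0)
x = rng.normal(size=500)
x_prime = 0.5 * x + rng.normal(size=500)

w2 = w2_empirical_1d(x, x_prime)
cost = l2_coupling_cost(x, x_prime)
print(w2 <= cost + 1e-12)  # the given pairing is one coupling among many
```

The given pairing of `x` and `x_prime` is only one admissible coupling, so its cost can only overestimate the infimum defining $W_2$.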
The set $\mathbb{A}$ of so-called admissible control processes $\alpha$ is defined as the set of 
$A$-valued progressively measurable processes $\alpha \in \mathbb{H}^{2,k}$, where $\mathbb{H}^{2,n}$ 
denotes the Hilbert space 

    $\mathbb{H}^{2,n} := \Big\{ Z \in \mathbb{H}^{0,n} ;\ \mathbb{E} \int_0^T |Z_s|^2\, ds < +\infty \Big\}$ 

with $\mathbb{H}^{0,n}$ standing for the collection of all $\mathbb{R}^n$-valued progressively 
measurable processes on $[0,T]$. By (A1) and (A2), any $\alpha \in \mathbb{A}$ satisfies 

    $\mathbb{E} \int_0^T \big[\, |b(t, 0, \delta_0, \alpha_t)|^2 + |\sigma(t, 0, \delta_0, \alpha_t)|^2 \,\big]\, dt < +\infty.$ 

Together with the Lipschitz assumption (A2), this guarantees that, for any $\alpha \in \mathbb{A}$, 
there exists a unique solution $X = X^\alpha$ of (1), and that moreover this solution satisfies 

(2)    $\mathbb{E} \sup_{0 \le t \le T} |X_t|^p < +\infty$ 

for every $p \in [1,2]$. See e.g. [16, 10] for a proof. 

The stochastic optimization problem which we consider is to minimize the objective function 

(3)    $J(\alpha) = \mathbb{E} \Big[ \int_0^T f(t, X_t, \mathbb{P}_{X_t}, \alpha_t)\, dt + g(X_T, \mathbb{P}_{X_T}) \Big]$ 

over the set $\mathbb{A}$ of admissible control processes $\alpha = (\alpha_t)_{0 \le t \le T}$. The 
running cost function $f$ is a real valued deterministic function on 
$[0,T] \times \mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d) \times A$, and the terminal cost function 
$g$ is a real valued deterministic function on $\mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d)$. 
Assumptions on the cost functions $f$ and $g$ will be spelled out later. 

The McKean-Vlasov dynamics posited in (1) are sometimes said to be of mean field type. This is 
justified by the fact that the uncontrolled stochastic differential equations of the McKean-Vlasov 
type first appeared in the infinite particle limit of large systems of particles with mean field 
interactions. See [12, 16, 10] for example. Typically, the dynamics of such a system of $N$ particles 
are given by a system of stochastic differential equations of the form 

    $dX^i_t = b^i(t, X^1_t, \dots, X^N_t)\, dt + \sigma^i(t, X^1_t, \dots, X^N_t)\, dW^i_t, \qquad i = 1, \dots, N,$ 

where the $W^i$'s are $N$ independent standard Wiener processes in $\mathbb{R}^m$, the $\sigma^i$'s 
are $N$ deterministic functions from $[0,T] \times \mathbb{R}^{N \times d}$ into the space of 
$d \times m$ real matrices, and the $b^i$'s are deterministic functions from 
$[0,T] \times \mathbb{R}^{N \times d}$ into $\mathbb{R}^d$. The interaction between the particles is 
said to be of mean field type when the functions $b^i$ and $\sigma^i$ are of the form 

    $b^i(t, x_1, \dots, x_N) = b(t, x_i, \bar\mu^N_x), \quad \text{and} \quad \sigma^i(t, x_1, \dots, x_N) = \sigma(t, x_i, \bar\mu^N_x), \qquad i = 1, \dots, N,$ 

for some deterministic function $b$ from $[0,T] \times \mathbb{R}^d \times \mathcal{P}_1(\mathbb{R}^d)$ 
into $\mathbb{R}^d$, and $\sigma$ from $[0,T] \times \mathbb{R}^d \times \mathcal{P}_1(\mathbb{R}^d)$ 
into the space of $d \times m$ real matrices. Here, for each $N$-tuple $x = (x_1, \dots, x_N)$, we 
denote by $\bar\mu^N_x$, or $\bar\mu^N$ when no confusion is possible, the empirical probability 
measure defined by: 

(4)    $\bar\mu^N(dx') = \frac{1}{N} \sum_{j=1}^N \delta_{x_j}(dx')$ 

and for each $x$ by $\delta_x$ the unit point mass (Dirac) measure at $x$. We shall come back to this 
formulation of the problem in the last section of the paper when we use results from the propagation 
of chaos to construct approximate equilibria. 
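
To make the particle system with mean field interaction concrete, here is a minimal simulation sketch. It is an illustration only, under assumptions that are not in the text: scalar particles ($d = m = 1$), the toy drift $b(t, x, \mu) = -(x - \int y\, \mu(dy))$, a constant volatility, and an Euler-Maruyama time discretization. The name `simulate_particles` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_particles(x0, T=1.0, n_steps=100, sigma=0.3, seed=0):
    """Euler-Maruyama scheme for N scalar particles with the toy mean field
    drift b(t, x, mu) = -(x - mean(mu)) and constant volatility sigma."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    dt = T / n_steps
    for _ in range(n_steps):
        # The empirical measure (4) enters the drift only through its mean.
        mean = x.mean()
        # Independent Brownian increments, one per particle.
        dw = rng.normal(scale=np.sqrt(dt), size=x.shape)
        x = x - (x - mean) * dt + sigma * dw
    return x

# Usage: 1000 particles started from a standard Gaussian sample.
start = np.random.default_rng(1).normal(size=1000)
end = simulate_particles(start)
print(end.shape)
```

With `sigma=0.0` the scheme is deterministic: the empirical mean is preserved and each deviation from the mean is multiplied by `1 - dt` at every step, which gives a simple sanity check of the implementation.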

We emphasize that the optimization problem (3) differs from the optimization problem encountered 
in the theory of mean field games. Differences between these optimization problems are discussed 
in [8]. When handling mean field games, the optimization of the cost functional (3) is performed 
for a fixed flow of probability measures. In other words, the argument 
$(\mathbb{P}_{X_t})_{0 \le t \le T}$ in (1) and (3) is kept fixed as $\alpha$ varies and the 
controlled processes are driven by the same flow of measures, which is not necessarily the flow of 
distributions of the process $(X_t)_{0 \le t \le T}$ but only an input. Solving the corresponding 
mean field game then consists in identifying a flow of probability measures, that is an input, such 
that the optimal states have precisely the input as flow of statistical distributions. As highlighted 
in Section 6, the optimization problem for controlled McKean-Vlasov dynamics as we consider here 
also reads as a limit problem, as $N$ tends to infinity, of the optimal states of $N$ interacting 
players or agents using a common policy. 

Useful Notations. Given a function $h : \mathbb{R}^d \to \mathbb{R}$ and a vector 
$p \in \mathbb{R}^d$, we will denote by $\partial h(x) \cdot p$ the action of the gradient of $h$ 
onto $p$. When $h : \mathbb{R}^d \to \mathbb{R}^\ell$, we will also denote by 
$\partial h(x) \cdot p$ the action of the gradient of $h$ onto $p$, the resulting quantity being an 
element of $\mathbb{R}^\ell$. When $h : \mathbb{R}^d \to \mathbb{R}^\ell$ and 
$p \in \mathbb{R}^\ell$, we will denote by $\partial h(x) \odot p$ the element of $\mathbb{R}^d$ 
defined by $\partial_x [h(x) \cdot p]$ where $\cdot$ is here understood as the inner product in 
$\mathbb{R}^\ell$. 

3. Preliminaries 

We now introduce the notation and concepts needed for the analysis of the stochastic optimization 
problem associated with the control of McKean-Vlasov dynamics. 

3.1. Differentiability and Convexity of Functions of Measures. There are many notions of 
differentiability for functions defined on spaces of measures, and recent progress in the theory of 
optimal transportation has put several forms in the limelight. See for example [1, 17] for 
expositions of these geometric approaches in textbook form. However, the notion of differentiability 
which we find convenient for the type of stochastic control problem studied in this paper is slightly 
different. It is more of a functional analytic nature. We believe that it was introduced by P.L. Lions 
in his lectures at the Collège de France. See [5] for a readable account. This notion of 
differentiability is based on the lifting of functions 
$\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto H(\mu)$ into functions $\tilde H$ defined on the Hilbert 
space $L^2(\tilde\Omega; \mathbb{R}^d)$ over some probability space 
$(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$ by setting 
$\tilde H(\tilde X) = H(\tilde{\mathbb{P}}_{\tilde X})$ for 
$\tilde X \in L^2(\tilde\Omega; \mathbb{R}^d)$, $\tilde\Omega$ being a Polish space and 
$\tilde{\mathbb{P}}$ an atomless measure. Then, a function $H$ is said to be differentiable at 
$\mu_0 \in \mathcal{P}_2(\mathbb{R}^d)$ if there exists a random variable $\tilde X_0$ with law 
$\mu_0$, in other words satisfying $\tilde{\mathbb{P}}_{\tilde X_0} = \mu_0$, such that the lifted 
function $\tilde H$ is Fréchet differentiable at $\tilde X_0$. Whenever this is the case, the Fréchet 
derivative of $\tilde H$ at $\tilde X_0$ can be viewed as an element of 
$L^2(\tilde\Omega; \mathbb{R}^d)$ by identifying $L^2(\tilde\Omega; \mathbb{R}^d)$ and its dual. It 
then turns out that its distribution depends only upon the law $\mu_0$ and not upon the particular 
random variable $\tilde X_0$ having distribution $\mu_0$. See Section 6 in [5] for details. This 
Fréchet derivative $[D\tilde H](\tilde X_0)$ is called the representation of the derivative of $H$ at 
$\mu_0$ along the variable $\tilde X_0$. Since it is viewed as an element of 
$L^2(\tilde\Omega; \mathbb{R}^d)$, by definition, 

(5)    $H(\mu) = H(\mu_0) + [D\tilde H](\tilde X_0) \cdot (\tilde X - \tilde X_0) + o(\|\tilde X - \tilde X_0\|_2)$ 
whenever $\tilde X$ and $\tilde X_0$ are random variables with distributions $\mu$ and $\mu_0$ 
respectively, the dot product being here the $L^2$-inner product over 
$(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$ and $\| \cdot \|_2$ the associated norm. It 
is shown in [5] that, as a random variable, it is of the form $\tilde h(\tilde X_0)$ for some 
deterministic measurable function $\tilde h : \mathbb{R}^d \to \mathbb{R}^d$, which is uniquely 
defined $\mu_0$-almost everywhere on $\mathbb{R}^d$. The equivalence class of $\tilde h$ in 
$L^2(\mathbb{R}^d, \mu_0)$ being uniquely defined, we can denote it by $\partial_\mu H(\mu_0)$ (or 
$\partial H(\mu_0)$ when no confusion is possible). We will call $\partial_\mu H(\mu_0)$ the 
derivative of $H$ at $\mu_0$ and we will often identify it with a function 
$\partial_\mu H(\mu_0)(\,\cdot\,) : \mathbb{R}^d \ni x \mapsto \partial_\mu H(\mu_0)(x) \in \mathbb{R}^d$ 
(or by $\partial H(\mu_0)(\,\cdot\,)$ when no confusion is possible). Notice that 
$\partial_\mu H(\mu_0)$ allows us to express $[D\tilde H](\tilde X_0)$ as a function of any random 
variable $\tilde X_0$ with distribution $\mu_0$, irrespective of where this random variable is 
defined. In particular, the differentiation formula (5) is somehow invariant by modification of the 
space $\tilde\Omega$ and of the variables $\tilde X_0$ and $\tilde X$ used for the representation of 
$H$, in the sense that $[D\tilde H](\tilde X_0)$ always reads as 
$\partial_\mu H(\mu_0)(\tilde X_0)$, whatever the choices of $\tilde\Omega$, $\tilde X_0$ and 
$\tilde X$ are. It is plain to see how this works when the function $H$ is of the form 

(6)    $H(\mu) = \int_{\mathbb{R}^d} h(x)\, \mu(dx) = \langle h, \mu \rangle$ 

for some scalar differentiable function $h$ defined on $\mathbb{R}^d$. Indeed, in this case, 
$\tilde H(\tilde X) = \tilde{\mathbb{E}}[h(\tilde X)]$ and 
$D\tilde H(\tilde X) \cdot \tilde Y = \tilde{\mathbb{E}}[\partial h(\tilde X) \cdot \tilde Y]$ so that 
we can think of $\partial_\mu H(\mu)$ as the deterministic function $\partial h$. We will use this 
particular example to recover the Pontryagin principle originally derived in [2] for scalar 
interactions as a particular case of the general Pontryagin principle which we prove below. The 
example (6) highlights the fact that this notion of differentiability is very different from the 
usual one. Indeed, given the fact that the function $H$ defined by (6) is linear in the measure $\mu$ 
when viewed as an element of the dual of a function space, one should expect the derivative to be $h$ 
and NOT $h'$! The notion of differentiability used in this paper is best understood as 
differentiation of functions of limits of empirical measures (or linear combinations of Dirac point 
masses) in the directions of the atoms of the measures. We illustrate this fact in the next two 
propositions. 

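Proposition 3.1. If $u$ is differentiable on $\mathcal{P}_2(\mathbb{R}^d)$, given any integer 
$N \ge 1$, we define the empirical projection of $u$ onto $\mathbb{R}^d$ by 

    $\bar u^N : (\mathbb{R}^d)^N \ni x = (x_1, \dots, x_N) \mapsto u\Big( \frac{1}{N} \sum_{i=1}^N \delta_{x_i} \Big).$ 

Then, $\bar u^N$ is differentiable on $(\mathbb{R}^d)^N$ and, for all $i \in \{1, \dots, N\}$, 

    $\partial_{x_i} \bar u^N(x) = \partial_{x_i} \bar u^N(x_1, \dots, x_N) = \frac{1}{N}\, \partial u\Big( \frac{1}{N} \sum_{j=1}^N \delta_{x_j} \Big)(x_i).$ 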

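Proof. On $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$, consider a uniformly distributed 
random variable $\vartheta$ over the set $\{1, \dots, N\}$. Then, for any fixed 
$x = (x_1, \dots, x_N) \in (\mathbb{R}^d)^N$, $x_\vartheta$ is a random variable having the 
distribution $\bar\mu^N = N^{-1} \sum_{i=1}^N \delta_{x_i}$. In particular, with the same notation as 
above for $\tilde u$, 

    $\bar u^N(x) = \bar u^N(x_1, \dots, x_N) = \tilde u(x_\vartheta).$ 

Therefore, for $h = (h_1, \dots, h_N) \in (\mathbb{R}^d)^N$, 

    $\bar u^N(x + h) = \tilde u(x_\vartheta + h_\vartheta) = \tilde u(x_\vartheta) + D\tilde u(x_\vartheta) \cdot h_\vartheta + o(|h|),$ 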



RENÉ CARMONA AND FRANÇOIS DELARUE 

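the dot product being here the $L^2$-inner product over 
$(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$, from which we deduce 

    $\bar u^N(x + h) = \bar u^N(x) + \frac{1}{N} \sum_{i=1}^N \partial u(\bar\mu^N)(x_i)\, h_i + o(|h|),$ 

which is the desired result. □ 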
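
The identity of Proposition 3.1 can be checked numerically on an explicit example. The sketch below is an illustration under assumptions that are not in the text: dimension $d = 1$ and the particular function $u(\mu) = (\int \sin(x)\, \mu(dx))^2$, whose lift is $\tilde u(\tilde X) = (\tilde{\mathbb{E}}[\sin \tilde X])^2$, so that $\partial u(\mu)(x) = 2 (\int \sin\, d\mu) \cos(x)$. The helper names are hypothetical.

```python
import numpy as np

def u_bar(x):
    """Empirical projection of u(mu) = (int sin dmu)^2 at the atoms x."""
    return np.mean(np.sin(x)) ** 2

def d_mu_u(atoms, point):
    """Derivative of u at the empirical measure of `atoms`, evaluated at `point`."""
    return 2.0 * np.mean(np.sin(atoms)) * np.cos(point)

def finite_difference_gradient(f, x, eps=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2.0 * eps)
    return grad

rng = np.random.default_rng(0)
x = rng.normal(size=7)

numerical = finite_difference_gradient(u_bar, x)
predicted = d_mu_u(x, x) / x.size  # (1/N) * du(mu^N)(x_i), as in Proposition 3.1
print(np.allclose(numerical, predicted, atol=1e-7))
```

The factor `1 / x.size` is the $1/N$ of the proposition: moving one atom perturbs the empirical measure by a mass $1/N$ only.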

The mapping $D\tilde u : L^2(\tilde\Omega; \mathbb{R}^d) \to L^2(\tilde\Omega; \mathbb{R}^d)$ is said 
to be Lipschitz continuous if there exists a constant $C > 0$ such that, for any square integrable 
random variables $\tilde X$ and $\tilde Y$ in $L^2(\tilde\Omega; \mathbb{R}^d)$, it holds 
$\|D\tilde u(\tilde X) - D\tilde u(\tilde Y)\|_2 \le C \|\tilde X - \tilde Y\|_2$. In such a case, the 
Lipschitz property can be transferred onto $L^2(\Omega)$ and then rewritten as: 

(7)    $\mathbb{E}\big[\, |\partial u(\mathbb{P}_X)(X) - \partial u(\mathbb{P}_Y)(Y)|^2 \,\big] \le C^2\, \mathbb{E}\big[\, |X - Y|^2 \,\big],$ 

for any square integrable random variables $X$ and $Y$ in $L^2(\Omega; \mathbb{R}^d)$. From our 
discussion of the construction of $\partial u$, notice that, for each $\mu$, 
$\partial u(\mu)(\,\cdot\,)$ is only uniquely defined $\mu$-almost everywhere. The following lemma 
(the proof of which is deferred to Subsection 3.3) then says that, in the current framework, there is 
a Lipschitz continuous version of $\partial u(\mu)(\,\cdot\,)$: 

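Lemma 3.2. Given a family of Borel-measurable mappings 
$(v(\mu)(\,\cdot\,) : \mathbb{R}^d \to \mathbb{R}^d)_{\mu \in \mathcal{P}_2(\mathbb{R}^d)}$ indexed by 
the probability measures of order 2 on $\mathbb{R}^d$, assume that there exists a constant $C$ such 
that, for any square integrable random variables $\xi$ and $\xi'$ in $L^2(\Omega; \mathbb{R}^d)$, it 
holds 

(8)    $\mathbb{E}\big[\, |v(\mathbb{P}_\xi)(\xi) - v(\mathbb{P}_{\xi'})(\xi')|^2 \,\big] \le C^2\, \mathbb{E}\big[\, |\xi - \xi'|^2 \,\big].$ 

Then, for each $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, one can redefine $v(\mu)(\,\cdot\,)$ on a 
$\mu$-negligible set in such a way that: 

    $\forall x, x' \in \mathbb{R}^d, \quad |v(\mu)(x) - v(\mu)(x')| \le C |x - x'|,$ 

for the same $C$ as in (8). 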



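By (7), we can use Lemma 3.2 in order to define $\partial u(\mu)(x)$ for every $\mu$ and every $x$ 
while preserving the Lipschitz property in the variable $x$. From now on, we shall use this version 
of $\partial u$. So, if $\mu, \nu \in \mathcal{P}_2(\mathbb{R}^d)$ and $X$ and $Y$ are random 
variables such that $\mathbb{P}_X = \mu$ and $\mathbb{P}_Y = \nu$, we have: 

    $\mathbb{E}\big[ |\partial u(\mu)(X) - \partial u(\nu)(X)|^2 \big] \le 2 \Big( \mathbb{E}\big[ |\partial u(\mu)(X) - \partial u(\nu)(Y)|^2 \big] + \mathbb{E}\big[ |\partial u(\nu)(Y) - \partial u(\nu)(X)|^2 \big] \Big)$ 
    $\qquad \le 4 C^2\, \mathbb{E}\big[ |Y - X|^2 \big],$ 

where we used the Lipschitz property (7) of the derivative together with the result of Lemma 3.2 
applied to the function $\partial u(\nu)$. Now, taking the infimum over all the couplings $(X, Y)$ 
with marginals $\mu$ and $\nu$, we obtain 

    $\inf_{X, Y} \mathbb{E}\big[ |\partial u(\mu)(X) - \partial u(\nu)(X)|^2 \big] \le 4 C^2\, W_2(\mu, \nu)^2$ 

and since the left-hand side depends only upon $\mu$ and not on $X$ as long as $\mathbb{P}_X = \mu$, 
we get 

(9)    $\mathbb{E}\big[ |\partial u(\mu)(X) - \partial u(\nu)(X)|^2 \big] \le 4 C^2\, W_2(\mu, \nu)^2.$ 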

We will use the following consequence of this estimate: 
Proposition 3.3. Let $u$ be a differentiable function on $\mathcal{P}_2(\mathbb{R}^d)$ with a 
Lipschitz derivative, and let $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, 
$x = (x_1, \dots, x_N) \in (\mathbb{R}^d)^N$ and $y = (y_1, \dots, y_N) \in (\mathbb{R}^d)^N$. Then, 
with the same notation as in the statement of Proposition 3.1, we have: 

    $\partial \bar u^N(x) \cdot (y - x) = \frac{1}{N} \sum_{i=1}^N \partial u(\mu)(x_i)(y_i - x_i) + O\Big( W_2(\bar\mu^N, \mu) \Big( N^{-1} \sum_{i=1}^N |x_i - y_i|^2 \Big)^{1/2} \Big),$ 

the dot product being here the usual Euclidean inner product and $O$ standing for the Landau notation. 



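Proof. Using Proposition 3.1 we get: 

    $\partial \bar u^N(x) \cdot (y - x) = \sum_{i=1}^N \partial_{x_i} \bar u^N(x)(y_i - x_i)$ 
    $\qquad = \frac{1}{N} \sum_{i=1}^N \partial u(\bar\mu^N)(x_i)(y_i - x_i)$ 
    $\qquad = \frac{1}{N} \sum_{i=1}^N \partial u(\mu)(x_i)(y_i - x_i) + \frac{1}{N} \sum_{i=1}^N \big[ \partial u(\bar\mu^N)(x_i) - \partial u(\mu)(x_i) \big](y_i - x_i).$ 

Now, by the Cauchy-Schwarz inequality, 

    $\Big| \frac{1}{N} \sum_{i=1}^N \big[ \partial u(\bar\mu^N)(x_i) - \partial u(\mu)(x_i) \big](y_i - x_i) \Big|$ 
    $\qquad \le \Big( \frac{1}{N} \sum_{i=1}^N |\partial u(\bar\mu^N)(x_i) - \partial u(\mu)(x_i)|^2 \Big)^{1/2} \Big( \frac{1}{N} \sum_{i=1}^N |y_i - x_i|^2 \Big)^{1/2}$ 
    $\qquad = \tilde{\mathbb{E}}\big[ |\partial u(\bar\mu^N)(x_\vartheta) - \partial u(\mu)(x_\vartheta)|^2 \big]^{1/2} \Big( \frac{1}{N} \sum_{i=1}^N |y_i - x_i|^2 \Big)^{1/2}$ 
    $\qquad \le 2 C\, W_2(\bar\mu^N, \mu) \Big( \frac{1}{N} \sum_{i=1}^N |y_i - x_i|^2 \Big)^{1/2},$ 

if we use the same notation for $\vartheta$ as in the proof of Proposition 3.1 and apply the estimate 
(9) with $X = x_\vartheta$, $\mu = \bar\mu^N$ and $\nu = \mu$. □ 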



Remark 3.4. We shall use the estimate of Proposition 3.3 when $x_i = X_i$ and the $X_i$'s are 
independent $\mathbb{R}^d$-valued random variables with common distribution $\mu$. Whenever 
$\mu \in \mathcal{P}_2(\mathbb{R}^d)$, the law of large numbers ensures that the Wasserstein distance 
between $\mu$ and the empirical measure $\bar\mu^N$ tends to 0 a.s., that is 

    $\mathbb{P}\Big( \lim_{N \to +\infty} W_2(\bar\mu^N, \mu) = 0 \Big) = 1,$ 

see for example Section 10 in [15]. Since we can find a constant $C > 0$, independent of $N$, such that 

    $W_2^2(\bar\mu^N, \mu) \le C \Big( 1 + \frac{1}{N} \sum_{i=1}^N |X_i|^2 \Big),$ 

we deduce from the law of large numbers again that the family $(W_2^2(\bar\mu^N, \mu))_{N \ge 1}$ is 
uniformly integrable, so that the convergence to 0 also holds in the $L^2$ sense: 

(10)    $\lim_{N \to +\infty} \mathbb{E}\big[ W_2^2(\bar\mu^N, \mu) \big] = 0.$ 

Whenever $\int_{\mathbb{R}^d} |x|^{d+5} \mu(dx) < \infty$, the rate of convergence can be specified. We 
indeed have the following standard estimate on the Wasserstein distance between $\mu$ and the 
empirical measure $\bar\mu^N$: 

(11)    $\mathbb{E}\big[ W_2^2(\bar\mu^N, \mu) \big] \le C N^{-2/(d+4)},$ 

for some constant $C > 0$, see for example Section 10 in [15]. Proposition 3.3 then says that, when $N$ 
is large, the gradient of $\bar u^N$ at the empirical sample $(X_i)_{1 \le i \le N}$ is close to the 
sample $(\partial u(\mu)(X_i))_{1 \le i \le N}$, the accuracy of the approximation being specified in 
the $L^2(\Omega)$ norm by (11) when $\mu$ is sufficiently integrable. 
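
The convergence (10) can be observed on a case where $W_2$ is computable in closed form. The sketch below is only an illustration under assumptions not made in the text: $d = 1$ and $\mu$ uniform on $(0,1)$, for which the optimal coupling is the quantile coupling. It does not attempt to verify the rate in (11), which is stated for general dimension. The function names and the number of repetitions are hypothetical choices.

```python
import numpy as np

def w2_squared_to_uniform(sample):
    """Exact W_2^2 between the empirical measure of `sample` and Uniform(0, 1).

    In dimension one the optimal coupling is the quantile coupling, so
    W_2^2 = sum_i int_{(i-1)/N}^{i/N} (x_(i) - u)^2 du, evaluated in closed form.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    a = np.arange(n) / n          # left end points (i-1)/N
    b = np.arange(1, n + 1) / n   # right end points i/N
    return float(np.sum(((x - a) ** 3 - (x - b) ** 3) / 3.0))

def mean_w2_squared(n, n_rep=50, seed=0):
    """Monte Carlo estimate of E[W_2^2(mu^N, mu)] for samples of size n."""
    rng = np.random.default_rng(seed)
    values = [w2_squared_to_uniform(rng.uniform(size=n)) for _ in range(n_rep)]
    return float(np.mean(values))

for n in (10, 100, 1000):
    print(n, mean_w2_squared(n))
```

The printed estimates are expected to decrease as the sample size grows, in line with (10).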

3.2. Joint Differentiability and Convexity. 

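Joint Differentiability. Below, we often consider functions 
$g : \mathbb{R}^n \times \mathcal{P}_2(\mathbb{R}^d) \ni (x, \mu) \mapsto g(x, \mu) \in \mathbb{R}$ 
depending on both an $n$-dimensional $x$ and a probability measure $\mu$. Joint differentiability is 
then defined according to the same procedure: $g$ is said to be jointly differentiable if the lifting 
$\tilde g : \mathbb{R}^n \times L^2(\tilde\Omega; \mathbb{R}^d) \ni (x, \tilde X) \mapsto g(x, \tilde{\mathbb{P}}_{\tilde X})$ 
is jointly differentiable. In such a case, we can define the partial derivatives in $x$ and $\mu$: 
they read 
$\mathbb{R}^n \times \mathcal{P}_2(\mathbb{R}^d) \ni (x, \mu) \mapsto \partial_x g(x, \mu)$ and 
$\mathbb{R}^n \times \mathcal{P}_2(\mathbb{R}^d) \ni (x, \mu) \mapsto \partial_\mu g(x, \mu)(\,\cdot\,) \in L^2(\mathbb{R}^d, \mu)$ 
respectively. The partial Fréchet derivative of $\tilde g$ in the direction $\tilde X$ thus reads 
$\mathbb{R}^n \times L^2(\tilde\Omega; \mathbb{R}^d) \ni (x, \tilde X) \mapsto D_{\tilde X}\, \tilde g(x, \tilde X) = \partial_\mu g(x, \tilde{\mathbb{P}}_{\tilde X})(\tilde X) \in L^2(\tilde\Omega; \mathbb{R}^d)$. 

We often use the fact that joint continuous differentiability in the two arguments is equivalent to 
partial differentiability in each of the two arguments and joint continuity of the partial 
derivatives. Here, the joint continuity of $\partial_x g$ is understood as the joint continuity with 
respect to the Euclidean distance on $\mathbb{R}^n$ and the Wasserstein distance on 
$\mathcal{P}_2(\mathbb{R}^d)$. The joint continuity of $\partial_\mu g$ is understood as the joint 
continuity of the mapping 
$(x, \tilde X) \mapsto \partial_\mu g(x, \tilde{\mathbb{P}}_{\tilde X})(\tilde X)$ from 
$\mathbb{R}^n \times L^2(\tilde\Omega; \mathbb{R}^d)$ into $L^2(\tilde\Omega; \mathbb{R}^d)$. 

When the partial derivatives of $g$ are assumed to be Lipschitz-continuous, we can benefit from 
Lemma 3.2. It says that, for any $(x, \mu)$, the representation 
$\mathbb{R}^d \ni x' \mapsto \partial_\mu g(x, \mu)(x')$ makes sense as a Lipschitz function in $x'$ 
and that an appropriate version of (9) holds true. 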

Convex Functions of Measures. We define a notion of convexity associated with this notion of 
differentiability. A function $g$ on $\mathcal{P}_2(\mathbb{R}^d)$ which is differentiable in the 
above sense is said to be convex if for every $\mu$ and $\mu'$ in $\mathcal{P}_2(\mathbb{R}^d)$ we 
have 

(12)    $g(\mu') - g(\mu) - \tilde{\mathbb{E}}\big[ \partial_\mu g(\mu)(\tilde X) \cdot (\tilde X' - \tilde X) \big] \ge 0$ 

whenever $\tilde X$ and $\tilde X'$ are square integrable random variables with distributions $\mu$ 
and $\mu'$ respectively. Examples are given in Subsection 4.3. 

More generally, a function $g$ on $\mathbb{R}^n \times \mathcal{P}_2(\mathbb{R}^d)$ which is jointly 
differentiable in the above sense is said to be convex if for every $(x, \mu)$ and $(x', \mu')$ in 
$\mathbb{R}^n \times \mathcal{P}_2(\mathbb{R}^d)$ we have 

(13)    $g(x', \mu') - g(x, \mu) - \partial_x g(x, \mu) \cdot (x' - x) - \tilde{\mathbb{E}}\big[ \partial_\mu g(x, \mu)(\tilde X) \cdot (\tilde X' - \tilde X) \big] \ge 0$ 

whenever $\tilde X$ and $\tilde X'$ are square integrable random variables with distributions $\mu$ 
and $\mu'$ respectively. 
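
As a sanity check of the convexity inequality (12), one can take the linear functional of example (6), $g(\mu) = \int h\, d\mu$, for which $\partial_\mu g(\mu) = \partial h$. The left-hand side of (12) then reduces to an expectation of $h(\tilde X') - h(\tilde X) - \partial h(\tilde X) \cdot (\tilde X' - \tilde X)$, which is nonnegative when $h$ is convex. The sketch below evaluates this quantity for empirical measures in dimension one; the choice of $h$, the sample sizes and the name `convexity_gap` are assumptions made only for illustration.

```python
import numpy as np

def convexity_gap(h, dh, x, x_prime):
    """Left-hand side of (12) for g(mu) = int h dmu, where mu and mu' are the
    empirical measures of x and x_prime and the coupling pairs x[i] with x_prime[i]."""
    x = np.asarray(x, dtype=float)
    x_prime = np.asarray(x_prime, dtype=float)
    return float(np.mean(h(x_prime) - h(x) - dh(x) * (x_prime - x)))

rng = np.random.default_rng(0)
x = rng.normal(size=200)
x_prime = rng.normal(loc=1.0, size=200)

# Convex choice h(x) = x^4: the gap is nonnegative for every coupling.
gap_convex = convexity_gap(lambda t: t ** 4, lambda t: 4.0 * t ** 3, x, x_prime)
# Concave choice h(x) = -x^2: the inequality (12) fails.
gap_concave = convexity_gap(lambda t: -t ** 2, lambda t: -2.0 * t, x, x_prime)
print(gap_convex >= 0.0, gap_concave < 0.0)
```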
3.3. Proof of Lemma 3.2. First Step. We first consider the case $v$ bounded and assume that $\mu$ has 
a strictly positive continuous density $p$ on the whole $\mathbb{R}^d$, $p$ and its derivatives being 
of exponential decay at infinity. We then claim that there exists a continuously differentiable 
one-to-one function $U$ from $(0,1)^d$ onto $\mathbb{R}^d$ such that, whenever 
$\eta_1, \dots, \eta_d$ are $d$ independent random variables, each of them being uniformly 
distributed on $(0,1)$, $U(\eta_1, \dots, \eta_d)$ has distribution $\mu$. It satisfies for any 
$(z_1, \dots, z_d) \in (0,1)^d$ 

    $\frac{\partial U_i}{\partial z_i}(z_1, \dots, z_d) \neq 0, \qquad \frac{\partial U_i}{\partial z_j}(z_1, \dots, z_d) = 0, \qquad 1 \le i < j \le d.$ 

The result is well-known when $d = 1$. In such a case, $U$ is the inverse of the cumulative 
distribution function of $p$. In higher dimension, $U$ can be constructed by an induction argument on 
the dimension. Assume indeed that some $\hat U$ has been constructed for the first marginal 
distribution $\hat\mu$ of $\mu$ on $\mathbb{R}^{d-1}$, that is for the push-forward of $\mu$ by the 
projection mapping $\mathbb{R}^d \ni (x_1, \dots, x_d) \mapsto (x_1, \dots, x_{d-1})$. Given 
$(x_1, \dots, x_{d-1}) \in \mathbb{R}^{d-1}$, we then denote by $p(\,\cdot\,|x_1, \dots, x_{d-1})$ the 
conditional density of $\mu$ given the $d - 1$ first coordinates: 

    $p(x_d | x_1, \dots, x_{d-1}) = \frac{p(x_1, \dots, x_d)}{\hat p(x_1, \dots, x_{d-1})}, \qquad (x_1, \dots, x_{d-1}) \in \mathbb{R}^{d-1},$ 

where $\hat p$ denotes the density of $\hat\mu$ (which is continuously differentiable and positive). 
We then denote by $(0,1) \ni z_d \mapsto U_d(z_d | x_1, \dots, x_{d-1})$ the inverse of the cumulative 
distribution function of the law of density $p(\,\cdot\,|x_1, \dots, x_{d-1})$. It satisfies 

    $F_d\big( U_d(z_d | x_1, \dots, x_{d-1}) \,\big|\, x_1, \dots, x_{d-1} \big) = z_d,$ 

with 

    $F_d(x_d | x_1, \dots, x_{d-1}) = \int_{-\infty}^{x_d} p(y | x_1, \dots, x_{d-1})\, dy,$ 

which is continuously differentiable in $(x_1, \dots, x_d)$ (using the exponential decay of the 
density at infinity). By the implicit function theorem, the mapping 
$\mathbb{R}^{d-1} \times (0,1) \ni (x_1, \dots, x_{d-1}, z_d) \mapsto U_d(z_d | x_1, \dots, x_{d-1})$ 
is continuously differentiable. The partial derivative with respect to $z_d$ is given by 

    $\frac{\partial U_d}{\partial z_d}(z_d | x_1, \dots, x_{d-1}) = \frac{1}{p\big( U_d(z_d | x_1, \dots, x_{d-1}) \,\big|\, x_1, \dots, x_{d-1} \big)},$ 

which is non-zero. We now let 

    $U(z_1, \dots, z_d) = \Big( \hat U(z_1, \dots, z_{d-1}),\ U_d\big( z_d \,\big|\, \hat U(z_1, \dots, z_{d-1}) \big) \Big), \qquad (z_1, \dots, z_d) \in (0,1)^d.$ 

By construction, $U(\eta_1, \dots, \eta_d)$ has distribution $\mu$: 
$\hat U(\eta_1, \dots, \eta_{d-1})$ has distribution $\hat\mu$ and the conditional law of 
$U_d(\eta_1, \dots, \eta_d)$ given $\eta_1, \dots, \eta_{d-1}$ is the conditional law of $\mu$ given 
the $d - 1$ first coordinates. It satisfies $[\partial U_d / \partial z_d](z_1, \dots, z_d) > 0$ and 
$[\partial U_i / \partial z_d](z_1, \dots, z_d) = 0$ for $i < d$. In particular, since $\hat U$ is 
assumed (by induction) to be injective and $[\partial U_d / \partial z_d](z_1, \dots, z_d) > 0$, $U$ 
must be injective as well. As the Jacobian matrix of $U$ is triangular with non-zero elements on the 
diagonal, it is invertible. By the global inversion theorem, $U$ is a diffeomorphism: the range of 
$U$ is the support of $\mu$, that is $\mathbb{R}^d$. This proves that $U$ is one-to-one from 
$(0,1)^d$ onto $\mathbb{R}^d$. 
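
The triangular construction of the first step can be made explicit in a simple case. The sketch below is an illustration under assumptions not made in the text: $d = 2$ and $\mu$ a centered Gaussian law with unit variances and correlation `rho`, whose density has Gaussian (hence at least exponential) decay. The first coordinate uses the inverse cumulative distribution function of the first marginal, the second one the inverse conditional distribution function given the first coordinate. The names `triangular_map` and `partial` are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Gaussian law on the real line

def triangular_map(z1, z2, rho=0.6):
    """Map (z1, z2) in (0,1)^2 to R^2 so that uniform inputs produce a centered
    Gaussian vector with unit variances and correlation rho."""
    x1 = _STD.inv_cdf(z1)                  # inverse CDF of the first marginal
    cond_std = math.sqrt(1.0 - rho ** 2)   # law of x2 given x1 is N(rho * x1, 1 - rho^2)
    x2 = rho * x1 + cond_std * _STD.inv_cdf(z2)
    return x1, x2

def partial(f, z, index, eps=1e-6):
    """Central finite difference of the vector-valued map f in direction `index`."""
    zp, zm = list(z), list(z)
    zp[index] += eps
    zm[index] -= eps
    return tuple((p - m) / (2.0 * eps) for p, m in zip(f(*zp), f(*zm)))

z = (0.3, 0.7)
dU_dz2 = partial(triangular_map, z, 1)
print(dU_dz2[0] == 0.0, dU_dz2[1] > 0.0)  # triangular Jacobian, positive diagonal
```

The first component does not depend on the last variable, and the derivative of the last component with respect to the last variable is the reciprocal of the conditional density, as in the formula for $\partial U_d / \partial z_d$ above.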

Second Step. We still consider the case $v$ bounded and assume that $\mu$ has a strictly positive 
continuous density $p$ on the whole $\mathbb{R}^d$, $p$ and its derivatives being of exponential decay 
at infinity. We will use the mapping $U$ constructed in the first step. For three random variables 
$\xi$, $\xi'$ and $G$ in $L^2(\Omega; \mathbb{R}^d)$, the pair $(\xi, \xi')$ being independent of $G$, 
the random variables $\xi$ and $\xi'$ having the same distribution, and $G$ being 
$\mathcal{N}_d(0, I_d)$ normally distributed, (8) implies that, for any integer $n \ge 1$ 

    $\mathbb{E}\big[ |v(\xi + n^{-1} G, \mathbb{P}_{\xi + n^{-1} G}) - v(\xi' + n^{-1} G, \mathbb{P}_{\xi + n^{-1} G})|^2 \big] \le C^2\, \mathbb{E}\big[ |\xi - \xi'|^2 \big].$ 

In particular, setting 

    $v_n(x) = \frac{n^d}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} v(y, \mathbb{P}_{\xi + n^{-1} G}) \exp\Big( -n^2 \frac{|x - y|^2}{2} \Big)\, dy,$ 

we have 

(14)    $\mathbb{E}\big[ |v_n(\xi) - v_n(\xi')|^2 \big] \le C^2\, \mathbb{E}\big[ |\xi - \xi'|^2 \big].$ 

Notice that $v_n$ is infinitely differentiable with bounded derivatives. 

We now choose a specific coupling for $\xi$ and $\xi'$. Indeed, we know that for any 
$\eta = (\eta_1, \dots, \eta_d)$ and $\eta' = (\eta'_1, \dots, \eta'_d)$, with uniform distribution on 
$(0,1)^d$, $U(\eta)$ and $U(\eta')$ have the same distribution as $\xi$. Without any loss of 
generality, we then assume that the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is given by 
$(0,1)^d \times \mathbb{R}^d$ endowed with its Borel $\sigma$-algebra and the product of the Lebesgue 
measure on $(0,1)^d$ and of the Gaussian measure $\mathcal{N}_d(0, I_d)$. The random variables $\eta$ 
and $G$ are then chosen as the canonical mappings 
$\eta : (0,1)^d \times \mathbb{R}^d \ni (z, y) \mapsto z$ and 
$G : (0,1)^d \times \mathbb{R}^d \ni (z, y) \mapsto y$. 

We then define $\eta'$ as a function of the variable $z \in (0,1)^d$ only. For a given 
$z^0 = (z^0_1, \dots, z^0_d) \in (0,1)^d$ and for $h$ small enough so that the open ball $B(z^0, h)$ of 
center $z^0$ and radius $h$ is included in $(0,1)^d$, we let: 

    $\eta'(z) = z - 2 (z_d - z^0_d)\, e_d$ on $B(z^0, h)$, $\qquad \eta'(z) = z$ outside $B(z^0, h)$, 

where $e_d$ is the $d$th vector of the canonical basis. We rewrite (14) as: 

    $\int_{(0,1)^d} |v_n(U(\eta(z))) - v_n(U(\eta'(z)))|^2\, dz \le C^2 \int_{(0,1)^d} |U(\eta(z)) - U(\eta'(z))|^2\, dz,$ 

or equivalently: 

(15)    $\int_{|r| < h} |v_n[U(z^0 + r - 2 r_d e_d)] - v_n(U(z^0 + r))|^2\, dr \le C^2 \int_{|r| < h} |U(z^0 + r - 2 r_d e_d) - U(z^0 + r)|^2\, dr.$ 

Since $U$ is continuously differentiable, we have 

    $v_n(U(z^0 + r)) = v_n(U(z^0)) + \partial v_n(U(z^0)) \cdot [\partial U(z^0) \cdot r] + o(r),$ 

where $\partial U(z^0)$ is a $d \times d$ matrix. We deduce 

    $v_n[U(z^0 + r - 2 r_d e_d)] - v_n(U(z^0 + r)) = -2 \sum_{i=1}^d \frac{\partial v_n}{\partial x_i}(U(z^0))\, \frac{\partial U_i}{\partial z_d}(z^0)\, r_d + o(r)$ 
    $\qquad = -2\, \frac{\partial v_n}{\partial x_d}(U(z^0))\, \frac{\partial U_d}{\partial z_d}(z^0)\, r_d + o(r)$ 
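since $\partial U_i / \partial z_d = 0$ for $i \neq d$, and 

(16)    $\int_{|r| < h} |v_n[U(z^0 + r - 2 r_d e_d)] - v_n(U(z^0 + r))|^2\, dr = 4\, \Big| \frac{\partial v_n}{\partial x_d}(U(z^0))\, \frac{\partial U_d}{\partial z_d}(z^0) \Big|^2 \int_{|r| < h} r_d^2\, dr + o(h^{d+2}).$ 

Similarly, 

(17)    $\int_{|r| < h} |U(z^0 + r - 2 r_d e_d) - U(z^0 + r)|^2\, dr = 4\, \Big| \frac{\partial U_d}{\partial z_d}(z^0) \Big|^2 \int_{|r| < h} r_d^2\, dr + o(h^{d+2}),$ 

and putting together (15), (16) and (17), we obtain 

    $\Big| \frac{\partial v_n}{\partial x_d}(U(z^0))\, \frac{\partial U_d}{\partial z_d}(z^0) \Big|^2 \int_{|r| < h} r_d^2\, dr \le C^2\, \Big| \frac{\partial U_d}{\partial z_d}(z^0) \Big|^2 \int_{|r| < h} r_d^2\, dr + o(h^{d+2}).$ 

Since $[\partial U_d / \partial z_d](z^0)$ is different from zero and $\int_{|r| < h} r_d^2\, dr$ is of 
order $h^{d+2}$, we deduce, letting $h$ tend to 0, that 

    $\Big| \frac{\partial v_n}{\partial x_d}(U(z^0)) \Big|^2 \le C^2,$ 

and since $U$ is a one-to-one mapping from $(0,1)^d$ onto $\mathbb{R}^d$, and $z^0 \in (0,1)^d$ is 
arbitrary, we conclude that $|[\partial v_n / \partial x_d](x)| \le C$, for any $x \in \mathbb{R}^d$. 
By changing the basis used for the construction of $U$ (we used the canonical one but we could use 
any orthonormal basis), we have $|\nabla v_n(x)\, e| \le C$ for any $x, e \in \mathbb{R}^d$ with 
$|e| = 1$. This proves that the functions $(v_n)_{n \ge 1}$ are uniformly bounded and $C$-Lipschitz 
continuous. We then denote by $\hat v$ the limit of a subsequence converging for the topology of 
uniform convergence on compact subsets. For simplicity, we keep the index $n$ to denote the 
subsequence. Assumption (8) implies: 

    $\mathbb{E}\big[ |v_n(\xi) - v(\xi, \mathbb{P}_\xi)|^2 \big] \le \mathbb{E}\big[ |v(\xi + n^{-1} G, \mathbb{P}_{\xi + n^{-1} G}) - v(\xi, \mathbb{P}_\xi)|^2 \big] \le C^2 d\, n^{-2},$ 

and taking the limit $n \to +\infty$, we deduce that $\hat v$ and $v(\,\cdot\,, \mathbb{P}_\xi)$ 
coincide $\mathbb{P}_\xi$ almost everywhere. This completes the proof when $v$ is bounded and $\xi$ 
has a continuous positive density $p$, $p$ and its derivatives being of exponential decay at 
infinity. 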

Third Step. When $v$ is bounded and $\xi$ is bounded and has a general distribution, we approximate 
$\xi$ by $\xi + n^{-1} G$ again. Then, $\xi + n^{-1} G$ has a positive continuous density, the density 
and its derivatives being of Gaussian decay at infinity, so that, by the second step, the function 
$\mathbb{R}^d \ni x \mapsto v(x, \mathbb{P}_{\xi + n^{-1} G})$ can be assumed to be $C$-Lipschitz 
continuous for each $n \ge 1$. Extracting a converging subsequence and passing to the limit as above, 
we deduce that $v(\,\cdot\,, \mathbb{P}_\xi)$ admits a $C$-Lipschitz continuous version. 

When $v$ is bounded but $\xi$ is not bounded, we approximate $\xi$ by its orthogonal projection on 
the ball of center 0 and radius $n$. We then complete the proof in a similar way. 

Finally when $v$ is not bounded, we approximate $v$ by $(\psi_n(v))_{n \ge 1}$ where, for each 
$n \ge 1$, $\psi_n$ is a bounded smooth function from $\mathbb{R}$ into itself such that 
$\psi_n(r) = r$ for $r \in [-n, n]$ and $|[d\psi_n / dr](r)| \le 1$ for all $r \in \mathbb{R}$. Then, 
for each $n \ge 1$, there exists a $C$-Lipschitz continuous version of 
$\psi_n(v(\,\cdot\,, \mathbb{P}_\xi))$. Choosing some $x_0 \in \mathbb{R}^d$ such that 
$|v(x_0, \mathbb{P}_\xi)| < +\infty$, the sequence $\psi_n(v(x_0, \mathbb{P}_\xi))$ is bounded so 
that the sequence of functions $(\psi_n(v(\,\cdot\,, \mathbb{P}_\xi)))_{n \ge 1}$ is uniformly bounded 
and continuous on compact subsets. Extracting a converging subsequence, we complete the proof in the 
same way as before. 
3.4. The Hamiltonian and the Dual Equations. The Hamiltonian of the stochastic optimization 
problem is defined as the function $H$ given by 

(18)    $H(t, x, \mu, y, z, \alpha) = b(t, x, \mu, \alpha) \cdot y + \sigma(t, x, \mu, \alpha) \cdot z + f(t, x, \mu, \alpha)$ 

where the dot notation stands here for the inner product in a Euclidean space. Because we need to 
compute derivatives of $H$ with respect to its variable $\mu$, we consider the lifting $\tilde H$ 
defined by 

(19)    $\tilde H(t, x, \tilde X, y, z, \alpha) = H(t, x, \mu, y, z, \alpha)$ 

for any random variable $\tilde X$ with distribution $\mu$, and we shall denote by 
$\partial_\mu H(t, x, \mu_0, y, z, \alpha)$ the derivative with respect to $\mu$ computed at $\mu_0$ 
(as defined above) whenever all the other variables $t$, $x$, $y$, $z$ and $\alpha$ are held fixed. We 
recall that $\partial_\mu H(t, x, \mu_0, y, z, \alpha)$ is an element of $L^2(\mathbb{R}^d, \mu_0)$ 
and that we identify it with a function 
$\partial_\mu H(t, x, \mu_0, y, z, \alpha)(\,\cdot\,) : \mathbb{R}^d \ni \tilde x \mapsto \partial_\mu H(t, x, \mu_0, y, z, \alpha)(\tilde x)$. 
It satisfies 
$D\tilde H(t, x, \tilde X, y, z, \alpha) = \partial_\mu H(t, x, \mu_0, y, z, \alpha)(\tilde X)$ 
almost-surely under $\tilde{\mathbb{P}}$. 
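
To fix ideas, the Hamiltonian (18) can be written out for a toy scalar model. Everything in the sketch below is a hypothetical illustration and not a model taken from the paper: the coefficients are assumed to be $b = a x + \bar a m + c \alpha$ with $m = \int x'\, \mu(dx')$, $\sigma = s$ constant, and $f = \frac{1}{2}(q x^2 + r \alpha^2)$. Since the measure enters only through its mean, the lift is affine in $\tilde{\mathbb{E}}[\tilde X]$ and $\partial_\mu H(\dots)(\tilde x) = \bar a\, y$ for every $\tilde x$.

```python
def hamiltonian(x, m, y, z, alpha, a=1.0, a_bar=0.5, c=1.0, s=0.2, q=1.0, r=2.0):
    """Hamiltonian (18) for the hypothetical scalar model
    b = a*x + a_bar*m + c*alpha, sigma = s, f = 0.5*(q*x^2 + r*alpha^2),
    where m stands for the mean of the measure argument."""
    drift = a * x + a_bar * m + c * alpha
    running_cost = 0.5 * (q * x ** 2 + r * alpha ** 2)
    return drift * y + s * z + running_cost

def d_mu_hamiltonian(y, a_bar=0.5):
    """Derivative of H in the measure argument: constant in the evaluation point,
    because H depends on the measure only through its mean."""
    return a_bar * y

def minimizer(y, c=1.0, r=2.0):
    """Minimizer of alpha -> H for this model (first-order condition c*y + r*alpha = 0)."""
    return -c * y / r

x, m, y, z = 0.4, -0.1, 0.7, 0.3
alpha_hat = minimizer(y)
print(hamiltonian(x, m, y, z, alpha_hat) <= hamiltonian(x, m, y, z, alpha_hat + 0.1))
```

The strict convexity of $\alpha \mapsto H$ in this toy model (here $r > 0$) is what makes the minimizer unique.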

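Definition 3.5. In addition to (A1-2), assume that the coefficients $b$, $\sigma$, $f$ and $g$ are 
differentiable with respect to $x$ and $\mu$. Then, given an admissible control 
$\alpha = (\alpha_t)_{0 \le t \le T} \in \mathbb{A}$, we denote by $X = X^\alpha$ the corresponding 
controlled state process. Whenever 

(20)    $\mathbb{E} \int_0^T \Big\{ |\partial_x f(t, X_t, \mathbb{P}_{X_t}, \alpha_t)|^2 + \tilde{\mathbb{E}}\big[ |\partial_\mu f(t, X_t, \mathbb{P}_{X_t}, \alpha_t)(\tilde X_t)|^2 \big] \Big\}\, dt < +\infty,$ 

and 

(21)    $\mathbb{E} \Big\{ |\partial_x g(X_T, \mathbb{P}_{X_T})|^2 + \tilde{\mathbb{E}}\big[ |\partial_\mu g(X_T, \mathbb{P}_{X_T})(\tilde X_T)|^2 \big] \Big\} < +\infty,$ 

we call adjoint processes of $X$ any couple $((Y_t)_{0 \le t \le T}, (Z_t)_{0 \le t \le T})$ of 
progressively measurable stochastic processes in $\mathbb{H}^{2,d} \times \mathbb{H}^{2,d \times m}$ 
satisfying the equation (which we call the adjoint equation): 

(22)    $dY_t = -\partial_x H(t, X_t, \mathbb{P}_{X_t}, Y_t, Z_t, \alpha_t)\, dt + Z_t\, dW_t - \tilde{\mathbb{E}}\big[ \partial_\mu H(t, \tilde X_t, \mathbb{P}_{X_t}, \tilde Y_t, \tilde Z_t, \tilde\alpha_t)(X_t) \big]\, dt,$ 
        $Y_T = \partial_x g(X_T, \mathbb{P}_{X_T}) + \tilde{\mathbb{E}}\big[ \partial_\mu g(\tilde X_T, \mathbb{P}_{X_T})(X_T) \big],$ 

where $(\tilde X, \tilde Y, \tilde Z, \tilde\alpha)$ is an independent copy of $(X, Y, Z, \alpha)$ 
defined on $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$ and $\tilde{\mathbb{E}}$ denotes 
the expectation on $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$. 

Notice that 
$\tilde{\mathbb{E}}[\partial_\mu H(t, \tilde X_t, \mathbb{P}_{X_t}, \tilde Y_t, \tilde Z_t, \tilde\alpha_t)(X_t)]$ 
is a function of the random variable $X_t$ as it stands for 
$\tilde{\mathbb{E}}[\partial_\mu H(t, \tilde X_t, \mathbb{P}_{X_t}, \tilde Y_t, \tilde Z_t, \tilde\alpha_t)(x)]_{|x = X_t}$ 
(and similarly for $\tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T, \mathbb{P}_{X_T})(X_T)]$). Notice 
that, when $b$, $\sigma$, $f$ and $g$ do not depend upon the marginal distributions of the controlled 
state process, the extra terms appearing in the adjoint equation and its terminal condition disappear 
and this equation coincides with the classical adjoint equation of stochastic control. 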

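Using the appropriate interpretation of the symbol $\odot$ as explained in Section 2 and extending this
notation to derivatives of the form $\partial_\mu h(\mu)(x)\odot p = (\partial_\mu[h(\mu)\cdot p])(x)$, the adjoint equation rewrites

$$\begin{aligned}
dY_t = {} & -\bigl[\partial_x b(t,X_t,\mathbb{P}_{X_t},\alpha_t)\odot Y_t + \partial_x\sigma(t,X_t,\mathbb{P}_{X_t},\alpha_t)\odot Z_t + \partial_x f(t,X_t,\mathbb{P}_{X_t},\alpha_t)\bigr]\,dt + Z_t\,dW_t \\
& - \tilde{\mathbb{E}}\bigl[\partial_\mu b(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\odot\tilde Y_t + \partial_\mu\sigma(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\odot\tilde Z_t \\
& \qquad + \partial_\mu f(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\bigr]\,dt,
\end{aligned} \tag{23}$$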






with the terminal condition $Y_T = \partial_x g(X_T,\mathbb{P}_{X_T}) + \tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T,\mathbb{P}_{X_T})(X_T)]$. Notice that $\partial_x b$ and $\partial_x\sigma$
are bounded since $b$ and $\sigma$ are assumed to be $c$-Lipschitz continuous in the variable $x$, see (A2). Notice
also that $\tilde{\mathbb{E}}[|\partial_\mu b(t,X_t,\mathbb{P}_{X_t},\alpha_t)(\tilde X_t)|^2]^{1/2}$ and $\tilde{\mathbb{E}}[|\partial_\mu\sigma(t,X_t,\mathbb{P}_{X_t},\alpha_t)(\tilde X_t)|^2]^{1/2}$ are also bounded by
$c$ since $b$ and $\sigma$ are assumed to be $c$-Lipschitz continuous in the variable $\mu$ with respect to the
2-Wasserstein distance. It is indeed plain to check that, given a differentiable function $h : \mathcal{P}_2(\mathbb{R}^d) \to \mathbb{R}$,
the notion of differentiability being defined as above, it holds $\mathbb{E}[|\partial_\mu h(\mu)(X)|^2]^{1/2} \le c$, for any
$\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and any random variable $X$ having $\mu$ as distribution, when $h$ is $c$-Lipschitz continuous
in $\mu$ with respect to the 2-Wasserstein distance.
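
The bound on the derivative stated in the preceding paragraph can be checked in two lines. The following is only a sketch: $\tilde h(X) = h(\mathbb{P}_X)$ denotes the lifting of $h$, $\|\cdot\|_2$ the norm of $L^2$, and $Z$ a generic square-integrable direction; these symbols are introduced here for the illustration. The first line uses that $(X',X)$ is one particular coupling of its marginals, the second the definition of the Fréchet derivative of the lifting.

```latex
% Sketch: c-Lipschitz continuity in W_2 bounds the L^2-norm of the derivative.
\begin{align*}
|\tilde h(X') - \tilde h(X)|
  &= |h(\mathbb{P}_{X'}) - h(\mathbb{P}_X)|
   \le c\, W_2(\mathbb{P}_{X'},\mathbb{P}_X)
   \le c\,\|X' - X\|_2 , \\
\mathbb{E}\bigl[|\partial_\mu h(\mu)(X)|^2\bigr]^{1/2}
  &= \|D\tilde h(X)\|_2
   = \sup_{\|Z\|_2 \le 1}\,
     \lim_{\varepsilon \to 0}
     \frac{|\tilde h(X + \varepsilon Z) - \tilde h(X)|}{\varepsilon}
   \le c .
\end{align*}
```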

Notice finally that, given an admissible control $\alpha \in \mathbb{A}$ and the corresponding controlled state
process $X = X^\alpha$, despite the conditions (20)-(21) and despite the fact that the first part of the equation
appears to be linear in the unknown processes $Y_t$ and $Z_t$, existence and uniqueness of a solution $(Y,Z)$
of the adjoint equation is not provided by standard results on Backward Stochastic Differential Equations
(BSDEs) as the distributions of the solution processes (more precisely their joint distributions
with the control and state processes $\alpha$ and $X$) appear in the coefficients of the equation. However, a
slight modification of the original existence and uniqueness result of Pardoux and Peng [13] shows
that existence and uniqueness still hold in our more general setting. The main lines of the proof are
given in H, Proposition 3.1 and Lemma 3.1. However, Lemma 3.1 in iH doesn't apply directly 
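since the coefficients $(\partial_\mu b(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\odot\tilde Y_t)_{0\le t\le T}$ and $(\partial_\mu\sigma(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\odot\tilde Z_t)_{0\le t\le T}$
are not Lipschitz continuous in $Y$ and $Z$ uniformly in the randomness, see Condition (CI) in H.
Actually, a careful inspection of the proof shows that the bounds

$$\mathbb{E}\tilde{\mathbb{E}}\bigl[|\partial_\mu b(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\odot\tilde Y_t|^2\bigr] \le c'\,\mathbb{E}\bigl[|Y_t|^2\bigr],$$

$$\mathbb{E}\tilde{\mathbb{E}}\bigl[|\partial_\mu\sigma(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\odot\tilde Z_t|^2\bigr] \le c'\,\mathbb{E}\bigl[|Z_t|^2\bigr],$$

are sufficient to make the whole argument work and thus to prove existence and uniqueness of a
solution $(Y,Z)$ satisfying

$$\mathbb{E}\Bigl[\sup_{0\le t\le T}|Y_t|^2 + \int_0^T |Z_t|^2\,dt\Bigr] < +\infty.$$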



4. PONTRYAGIN PRINCIPLE FOR OPTIMALITY 

In this section, we discuss sufficient and necessary conditions for optimality when the Hamiltonian
satisfies appropriate assumptions of convexity. These conditions will be specified next, depending
upon the framework. For the time being, we detail the regularity properties that will be used throughout
the section. Referring to Subsection 3.2 for definitions of joint differentiability, we assume:

(A3) The functions $b$, $\sigma$ and $f$ are differentiable with respect to $(x,\alpha)$, the mappings $(x,\mu,\alpha) \mapsto \partial_x(b,\sigma,f)(t,x,\mu,\alpha)$
and $(x,\mu,\alpha) \mapsto \partial_\alpha(b,\sigma,f)(t,x,\mu,\alpha)$ being continuous for any $t \in [0,T]$. The
functions $b$, $\sigma$ and $f$ are also differentiable with respect to the variable $\mu$ in the sense given above,
the mapping $\mathbb{R}^d \times L^2(\tilde\Omega;\mathbb{R}^d) \times A \ni (x,\tilde X,\alpha) \mapsto \partial_\mu(b,\sigma,f)(t,x,\mathbb{P}_{\tilde X},\alpha)(\tilde X) \in L^2(\tilde\Omega;\mathbb{R}^{d\times d} \times \mathbb{R}^{(d\times m)\times d} \times \mathbb{R}^d)$
being continuous for any $t \in [0,T]$. Similarly, the function $g$ is differentiable with
respect to $x$, the mapping $(x,\mu) \mapsto \partial_x g(x,\mu)$ being continuous. The function $g$ is also differentiable
with respect to the variable $\mu$, the mapping $\mathbb{R}^d \times L^2(\tilde\Omega;\mathbb{R}^d) \ni (x,\tilde X) \mapsto \partial_\mu g(x,\mathbb{P}_{\tilde X})(\tilde X) \in L^2(\tilde\Omega;\mathbb{R}^d)$ being
continuous.

(A4) The coefficients $((b,\sigma,f)(t,0,\delta_0,0))_{0\le t\le T}$ are uniformly bounded. The partial derivatives
$\partial_x b$ and $\partial_\alpha b$ are uniformly bounded and the norm of the mapping $x' \mapsto \partial_\mu(b,\sigma)(t,x,\mu,\alpha)(x')$
in $L^2(\mathbb{R}^d,\mu)$ is also uniformly bounded (i.e. uniformly in $(t,x,\mu,\alpha)$). There exists a constant $L$
such that, for any $R \ge 0$ and any $(t,x,\mu,\alpha)$ such that $|x|, \|\mu\|_2, |\alpha| \le R$, $|\partial_x(f,g)(t,x,\mu,\alpha)|$ and
$|\partial_\alpha f(t,x,\mu,\alpha)|$ are bounded by $L(1+R)$ and the $L^2(\mathbb{R}^d,\mu)$-norm of $x' \mapsto \partial_\mu(f,g)(t,x,\mu,\alpha)(x')$
is bounded by $L(1+R)$. Here, we have used the notation

$$\|\mu\|_2^2 = \int_{\mathbb{R}^d} |x|^2\,d\mu(x), \qquad \mu \in \mathcal{P}_2(\mathbb{R}^d).$$

4.1. A Necessary Condition. We assume that the sets $A$ and $\mathbb{A}$ of admissible controls are convex, we
fix $\alpha \in \mathbb{A}$, and as before, we denote by $X = X^\alpha$ the corresponding controlled state process, namely
the solution of (1) with given initial condition $X_0 = x_0$. Our first task is to compute the Gateaux
derivative of the cost functional $J$ at $\alpha$ in all directions. In order to do so, we choose $\beta \in \mathbb{H}^{2,k}$ such
that $\alpha + \epsilon\beta \in \mathbb{A}$ for $\epsilon > 0$ small enough. We then compute the variation of $J$ at $\alpha$ in the direction of
$\beta$ (think of $\beta$ as the difference between another element of $\mathbb{A}$ and $\alpha$).

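Letting $(\theta_t = (X_t,\mathbb{P}_{X_t},\alpha_t))_{0\le t\le T}$, we define the variation process $V = (V_t)_{0\le t\le T}$ to be the
solution of the equation

$$dV_t = \bigl[\gamma_t\cdot V_t + \delta_t(\mathbb{P}_{(X_t,V_t)}) + \eta_t\bigr]\,dt + \bigl[\tilde\gamma_t\cdot V_t + \tilde\delta_t(\mathbb{P}_{(X_t,V_t)}) + \tilde\eta_t\bigr]\,dW_t, \tag{24}$$

with $V_0 = 0$, where the coefficients $\gamma_t$, $\tilde\gamma_t$, $\eta_t$ and $\tilde\eta_t$ are defined as

$$\gamma_t = \partial_x b(t,\theta_t), \quad \tilde\gamma_t = \partial_x\sigma(t,\theta_t), \quad \eta_t = \partial_\alpha b(t,\theta_t)\cdot\beta_t, \quad\text{and}\quad \tilde\eta_t = \partial_\alpha\sigma(t,\theta_t)\cdot\beta_t,$$

which are progressively measurable bounded processes with values in $\mathbb{R}^{d\times d}$, $\mathbb{R}^{(d\times m)\times d}$, $\mathbb{R}^d$ and
$\mathbb{R}^{d\times m}$ respectively (the parentheses around $d\times m$ indicating that $\tilde\gamma_t\cdot u$ is seen as an element of $\mathbb{R}^{d\times m}$
whenever $u \in \mathbb{R}^d$), and

$$\begin{aligned}
\delta_t &= \tilde{\mathbb{E}}\bigl[\partial_\mu b(t,\theta_t)(\tilde X_t)\cdot\tilde V_t\bigr] = \tilde{\mathbb{E}}\bigl[\partial_\mu b(t,x,\mathbb{P}_{X_t},\alpha)(\tilde X_t)\cdot\tilde V_t\bigr]\Big|_{x=X_t,\ \alpha=\alpha_t}, \\
\tilde\delta_t &= \tilde{\mathbb{E}}\bigl[\partial_\mu\sigma(t,\theta_t)(\tilde X_t)\cdot\tilde V_t\bigr] = \tilde{\mathbb{E}}\bigl[\partial_\mu\sigma(t,x,\mathbb{P}_{X_t},\alpha)(\tilde X_t)\cdot\tilde V_t\bigr]\Big|_{x=X_t,\ \alpha=\alpha_t},
\end{aligned} \tag{25}$$

which are progressively measurable bounded processes with values in $\mathbb{R}^d$ and $\mathbb{R}^{d\times m}$ respectively and
where $(\tilde X_t,\tilde V_t)$ is an independent copy of $(X_t,V_t)$. As expectations of functions of $(\tilde X_t,\tilde V_t)$, $\delta_t$ and
$\tilde\delta_t$ depend upon the joint distribution of $X_t$ and $V_t$. In (24) we wrote $\delta_t(\mathbb{P}_{(X_t,V_t)})$ and $\tilde\delta_t(\mathbb{P}_{(X_t,V_t)})$
in order to stress the dependence upon the joint distribution of $X_t$ and $V_t$. Even though we are
dealing with possibly random coefficients, the existence and uniqueness of the variation process is
guaranteed by Proposition 2.1 of [10] applied to the couple $(X,V)$ and the system formed by (1) and
(24). Because of our assumption on the boundedness of the partial derivatives of the coefficients, $V$
satisfies $\mathbb{E}\sup_{0\le t\le T}|V_t|^p < \infty$ for every finite $p \ge 1$.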

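Lemma 4.1. For each $\epsilon > 0$ small enough, we denote by $\alpha^\epsilon$ the admissible control defined by
$\alpha^\epsilon_t = \alpha_t + \epsilon\beta_t$, and by $X^\epsilon = X^{\alpha^\epsilon}$ the corresponding controlled state. We have:

$$\lim_{\epsilon\searrow 0}\mathbb{E}\sup_{0\le t\le T}\Bigl|\frac{X^\epsilon_t - X_t}{\epsilon} - V_t\Bigr|^2 = 0. \tag{26}$$

Proof. For the purpose of this proof we set $\theta^\epsilon_t = (X^\epsilon_t,\mathbb{P}_{X^\epsilon_t},\alpha^\epsilon_t)$ and $V^\epsilon_t = \epsilon^{-1}(X^\epsilon_t - X_t) - V_t$. Notice
that $V^\epsilon_0 = 0$ and that

$$\begin{aligned}
dV^\epsilon_t = {} & \Bigl[\frac{1}{\epsilon}\bigl[b(t,\theta^\epsilon_t) - b(t,\theta_t)\bigr] - \partial_x b(t,\theta_t)\cdot V_t - \partial_\alpha b(t,\theta_t)\cdot\beta_t - \tilde{\mathbb{E}}\bigl[\partial_\mu b(t,\theta_t)(\tilde X_t)\cdot\tilde V_t\bigr]\Bigr]\,dt \\
& + \Bigl[\frac{1}{\epsilon}\bigl[\sigma(t,\theta^\epsilon_t) - \sigma(t,\theta_t)\bigr] - \partial_x\sigma(t,\theta_t)\cdot V_t - \partial_\alpha\sigma(t,\theta_t)\cdot\beta_t - \tilde{\mathbb{E}}\bigl[\partial_\mu\sigma(t,\theta_t)(\tilde X_t)\cdot\tilde V_t\bigr]\Bigr]\,dW_t \\
= {} & V^{\epsilon,1}_t\,dt + V^{\epsilon,2}_t\,dW_t.
\end{aligned} \tag{27}$$

Now for each $t \in [0,T]$ and each $\epsilon > 0$, we have:

$$\begin{aligned}
\frac{1}{\epsilon}\bigl[b(t,\theta^\epsilon_t) - b(t,\theta_t)\bigr] = {} & \int_0^1 \partial_x b(t,\theta^{\lambda,\epsilon}_t)\cdot(V^\epsilon_t + V_t)\,d\lambda + \int_0^1 \partial_\alpha b(t,\theta^{\lambda,\epsilon}_t)\cdot\beta_t\,d\lambda \\
& + \int_0^1 \tilde{\mathbb{E}}\bigl[\partial_\mu b(t,\theta^{\lambda,\epsilon}_t)(\tilde X^{\lambda,\epsilon}_t)\cdot(\tilde V^\epsilon_t + \tilde V_t)\bigr]\,d\lambda,
\end{aligned}$$

where, in order to simplify a little bit the notation, we have set $X^{\lambda,\epsilon}_t = X_t + \lambda\epsilon(V^\epsilon_t + V_t)$, $\alpha^{\lambda,\epsilon}_t = \alpha_t + \lambda\epsilon\beta_t$
and $\theta^{\lambda,\epsilon}_t = (X^{\lambda,\epsilon}_t,\mathbb{P}_{X^{\lambda,\epsilon}_t},\alpha^{\lambda,\epsilon}_t)$. Computing the '$dt$'-term, we get:

$$\begin{aligned}
V^{\epsilon,1}_t = {} & \int_0^1 \partial_x b(t,\theta^{\lambda,\epsilon}_t)\cdot V^\epsilon_t\,d\lambda + \int_0^1 \tilde{\mathbb{E}}\bigl[\partial_\mu b(t,\theta^{\lambda,\epsilon}_t)(\tilde X^{\lambda,\epsilon}_t)\cdot\tilde V^\epsilon_t\bigr]\,d\lambda \\
& + \int_0^1 \bigl[\partial_x b(t,\theta^{\lambda,\epsilon}_t) - \partial_x b(t,\theta_t)\bigr]\cdot V_t\,d\lambda + \int_0^1 \bigl[\partial_\alpha b(t,\theta^{\lambda,\epsilon}_t) - \partial_\alpha b(t,\theta_t)\bigr]\cdot\beta_t\,d\lambda \\
& + \int_0^1 \tilde{\mathbb{E}}\bigl[\bigl(\partial_\mu b(t,\theta^{\lambda,\epsilon}_t)(\tilde X^{\lambda,\epsilon}_t) - \partial_\mu b(t,\theta_t)(\tilde X_t)\bigr)\cdot\tilde V_t\bigr]\,d\lambda \\
= {} & \int_0^1 \partial_x b(t,\theta^{\lambda,\epsilon}_t)\cdot V^\epsilon_t\,d\lambda + \int_0^1 \tilde{\mathbb{E}}\bigl[\partial_\mu b(t,\theta^{\lambda,\epsilon}_t)(\tilde X^{\lambda,\epsilon}_t)\cdot\tilde V^\epsilon_t\bigr]\,d\lambda + I^{\epsilon,1}_t + I^{\epsilon,2}_t + I^{\epsilon,3}_t.
\end{aligned}$$

The three last terms of the above right hand side are bounded in $L^2([0,T]\times\Omega)$, uniformly in $\epsilon$. Indeed,
the Lipschitz regularity of $b$ implies that $\partial_x b$ and $\partial_\alpha b$ are bounded and that $\partial_\mu b(t,x,\mathbb{P}_{\tilde X},\alpha)(\tilde X)$ is
bounded in $L^2(\tilde\Omega;\mathbb{R}^{d\times d})$, uniformly in $(t,x,\alpha) \in [0,T]\times\mathbb{R}^d\times A$ and $\tilde X \in L^2(\tilde\Omega;\mathbb{R}^d)$. Next, we treat
the diffusion part $V^{\epsilon,2}_t$ in the same way using Jensen's inequality and Burkholder-Davis-Gundy's
inequality to control the quadratic variation of the stochastic integrals. Consequently, going back to
(27), we see that, for any $S \in [0,T]$,

$$\mathbb{E}\sup_{0\le t\le S}|V^\epsilon_t|^2 \le c' + c'\int_0^S \mathbb{E}\sup_{0\le s\le t}|V^\epsilon_s|^2\,dt,$$

where as usual $c' > 0$ is a generic constant which can change from line to line. Applying Gronwall's
inequality, we deduce that $\mathbb{E}\sup_{0\le t\le T}|V^\epsilon_t|^2 \le c'$. Therefore, we have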

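$$\lim_{\epsilon\searrow 0}\mathbb{E}\Bigl[\sup_{0\le\lambda\le 1}\,\sup_{0\le t\le T}|X^{\lambda,\epsilon}_t - X_t|^2\Bigr] = 0.$$

We then prove that $I^{\epsilon,1}$, $I^{\epsilon,2}$ and $I^{\epsilon,3}$ converge to $0$ in $L^2([0,T]\times\Omega)$ as $\epsilon\searrow 0$. Indeed,

$$\begin{aligned}
\mathbb{E}\int_0^T |I^{\epsilon,1}_t|^2\,dt &= \mathbb{E}\int_0^T \Bigl|\int_0^1 \bigl[\partial_x b(t,\theta^{\lambda,\epsilon}_t) - \partial_x b(t,\theta_t)\bigr]\cdot V_t\,d\lambda\Bigr|^2 dt \\
&\le \mathbb{E}\int_0^T\!\!\int_0^1 |\partial_x b(t,\theta^{\lambda,\epsilon}_t) - \partial_x b(t,\theta_t)|^2\,|V_t|^2\,d\lambda\,dt.
\end{aligned}$$

Since the function $\partial_x b$ is bounded and continuous in $x$, $\mu$ and $\alpha$, the above right-hand side converges
to $0$ as $\epsilon\searrow 0$. A similar argument applies to $I^{\epsilon,2}$ and $I^{\epsilon,3}$. Again, we treat the diffusion part $V^{\epsilon,2}$
in the same way using Jensen's inequality and Burkholder-Davis-Gundy's inequality. Consequently,
going back to (27), we finally see that, for any $S \in [0,T]$,

$$\mathbb{E}\sup_{0\le t\le S}|V^\epsilon_t|^2 \le c'\int_0^S \mathbb{E}\sup_{0\le s\le t}|V^\epsilon_s|^2\,dt + \delta_\epsilon,$$

where $\lim_{\epsilon\searrow 0}\delta_\epsilon = 0$. Finally, we get the desired result applying Gronwall's inequality. □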
We now compute the Gateaux derivative of the objective function. 

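Lemma 4.2. The function $\alpha \mapsto J(\alpha)$ is Gateaux differentiable and its derivative in the direction $\beta$ is
given by:

$$\begin{aligned}
\frac{d}{d\epsilon}J(\alpha + \epsilon\beta)\Big|_{\epsilon=0} = {} & \mathbb{E}\int_0^T \bigl[\partial_x f(t,\theta_t)\cdot V_t + \tilde{\mathbb{E}}[\partial_\mu f(t,\theta_t)(\tilde X_t)\cdot\tilde V_t] + \partial_\alpha f(t,\theta_t)\cdot\beta_t\bigr]\,dt \\
& + \mathbb{E}\bigl[\partial_x g(X_T,\mathbb{P}_{X_T})\cdot V_T + \tilde{\mathbb{E}}[\partial_\mu g(X_T,\mathbb{P}_{X_T})(\tilde X_T)\cdot\tilde V_T]\bigr].
\end{aligned} \tag{28}$$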
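Proof. We use freely the notation introduced in the proof of the previous lemma.

$$\begin{aligned}
\frac{d}{d\epsilon}J(\alpha + \epsilon\beta)\Big|_{\epsilon=0} = {} & \lim_{\epsilon\searrow 0}\frac{1}{\epsilon}\,\mathbb{E}\int_0^T \bigl[f(t,\theta^\epsilon_t) - f(t,\theta_t)\bigr]\,dt \\
& + \lim_{\epsilon\searrow 0}\frac{1}{\epsilon}\,\mathbb{E}\bigl[g(X^\epsilon_T,\mathbb{P}_{X^\epsilon_T}) - g(X_T,\mathbb{P}_{X_T})\bigr].
\end{aligned} \tag{29}$$

Computing the two limits separately we get

$$\begin{aligned}
\lim_{\epsilon\searrow 0}\frac{1}{\epsilon}\,\mathbb{E}\int_0^T \bigl[f(t,\theta^\epsilon_t) - f(t,\theta_t)\bigr]\,dt
&= \lim_{\epsilon\searrow 0}\frac{1}{\epsilon}\,\mathbb{E}\int_0^T\!\!\int_0^1 \frac{d}{d\lambda}f(t,\theta^{\lambda,\epsilon}_t)\,d\lambda\,dt \\
&= \lim_{\epsilon\searrow 0}\mathbb{E}\int_0^T\!\!\int_0^1 \bigl[\partial_x f(t,\theta^{\lambda,\epsilon}_t)\cdot(V^\epsilon_t + V_t) \\
&\qquad + \tilde{\mathbb{E}}[\partial_\mu f(t,\theta^{\lambda,\epsilon}_t)(\tilde X^{\lambda,\epsilon}_t)\cdot(\tilde V^\epsilon_t + \tilde V_t)] + \partial_\alpha f(t,\theta^{\lambda,\epsilon}_t)\cdot\beta_t\bigr]\,d\lambda\,dt \\
&= \mathbb{E}\int_0^T \bigl[\partial_x f(t,\theta_t)\cdot V_t + \tilde{\mathbb{E}}[\partial_\mu f(t,\theta_t)(\tilde X_t)\cdot\tilde V_t] + \partial_\alpha f(t,\theta_t)\cdot\beta_t\bigr]\,dt
\end{aligned}$$

using the hypothesis on the continuity and growth of the derivatives of $f$, the uniform convergence
proven in the previous lemma and standard uniform integrability arguments. The second term in (29)
is tackled in a similar way. □

Observing that the conditions (20)-(21) are satisfied under (A1-4), the duality relationship is given by:

Lemma 4.3. Given $(Y_t,Z_t)_{0\le t\le T}$ as in Definition 3.5, it holds:

$$\begin{aligned}
\mathbb{E}[Y_T\cdot V_T] = \mathbb{E}\int_0^T \bigl[ & Y_t\cdot(\partial_\alpha b(t,\theta_t)\cdot\beta_t) + Z_t\cdot(\partial_\alpha\sigma(t,\theta_t)\cdot\beta_t) \\
& - \partial_x f(t,\theta_t)\cdot V_t - \tilde{\mathbb{E}}[\partial_\mu f(t,\theta_t)(\tilde X_t)\cdot\tilde V_t]\bigr]\,dt.
\end{aligned} \tag{30}$$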

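Proof. Letting $\Theta_t = (X_t,\mathbb{P}_{X_t},Y_t,Z_t,\alpha_t)$ and using the definitions (24) of the variation process $V$,
and (22) or (23) of the adjoint process $Y$, integration by parts gives:

$$\begin{aligned}
Y_T\cdot V_T = {} & Y_0\cdot V_0 + \int_0^T Y_t\cdot dV_t + \int_0^T dY_t\cdot V_t + \int_0^T d[Y,V]_t \\
= {} & M_T + \int_0^T \Bigl[ Y_t\cdot(\partial_x b(t,\theta_t)\cdot V_t) + Y_t\cdot\tilde{\mathbb{E}}[\partial_\mu b(t,\theta_t)(\tilde X_t)\cdot\tilde V_t] + Y_t\cdot(\partial_\alpha b(t,\theta_t)\cdot\beta_t) \\
& \qquad - \partial_x H(t,\Theta_t)\cdot V_t - \tilde{\mathbb{E}}[\partial_\mu H(t,\tilde\Theta_t)(X_t)\cdot V_t] \\
& \qquad + Z_t\cdot(\partial_x\sigma(t,\theta_t)\cdot V_t) + Z_t\cdot\tilde{\mathbb{E}}[\partial_\mu\sigma(t,\theta_t)(\tilde X_t)\cdot\tilde V_t] + Z_t\cdot(\partial_\alpha\sigma(t,\theta_t)\cdot\beta_t)\Bigr]\,dt,
\end{aligned}$$

where $(M_t)_{0\le t\le T}$ is a mean zero integrable martingale. By taking expectations on both sides and
applying Fubini's theorem:

$$\begin{aligned}
\mathbb{E}\tilde{\mathbb{E}}[\partial_\mu H(t,\tilde\Theta_t)(X_t)\cdot V_t]
&= \mathbb{E}\tilde{\mathbb{E}}[\partial_\mu H(t,\Theta_t)(\tilde X_t)\cdot\tilde V_t] \\
&= \mathbb{E}\tilde{\mathbb{E}}\bigl[(\partial_\mu b(t,\theta_t)(\tilde X_t)\cdot\tilde V_t)\cdot Y_t + (\partial_\mu\sigma(t,\theta_t)(\tilde X_t)\cdot\tilde V_t)\cdot Z_t + \partial_\mu f(t,\theta_t)(\tilde X_t)\cdot\tilde V_t\bigr].
\end{aligned}$$

By commutativity of the inner product, cancellations occur and we get the desired equality (30). □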



By putting together the duality relation (30) and (28) we get:

Corollary 4.4. The Gateaux derivative of $J$ at $\alpha$ in the direction of $\beta$ can be written as:

$$\frac{d}{d\epsilon}J(\alpha + \epsilon\beta)\Big|_{\epsilon=0} = \mathbb{E}\int_0^T \partial_\alpha H(t,X_t,\mathbb{P}_{X_t},Y_t,Z_t,\alpha_t)\cdot\beta_t\,dt. \tag{31}$$

Proof. Using Fubini's theorem, the second expectation appearing in the expression (28) of the Gateaux
derivative of $J$ given in Lemma 4.2 can be rewritten as:

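$$\begin{aligned}
\mathbb{E}\bigl[\partial_x g(X_T,\mathbb{P}_{X_T})\cdot V_T + \tilde{\mathbb{E}}(\partial_\mu g(X_T,\mathbb{P}_{X_T})(\tilde X_T)\cdot\tilde V_T)\bigr]
&= \mathbb{E}[\partial_x g(X_T,\mathbb{P}_{X_T})\cdot V_T] + \mathbb{E}\tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T,\mathbb{P}_{X_T})(X_T)\cdot V_T] \\
&= \mathbb{E}[Y_T\cdot V_T],
\end{aligned}$$

and using the expression derived in Lemma 4.3 for $\mathbb{E}[Y_T\cdot V_T]$ in (28) gives the desired result. □

The main result of this subsection is the following theorem.

Theorem 4.5. Under the above assumptions, if we assume further that the Hamiltonian $H$ is convex
in $\alpha$, the admissible control $\alpha \in \mathbb{A}$ is optimal, $X$ is the associated (optimally) controlled state, and
$(Y,Z)$ are the associated adjoint processes solving the adjoint equation (22), then we have:

$$\forall\alpha \in A, \quad H(t,X_t,Y_t,Z_t,\alpha_t) \le H(t,X_t,Y_t,Z_t,\alpha) \quad \text{a.e. in } t \in [0,T],\ \mathbb{P}\text{-a.s.} \tag{32}$$

Proof. Since $\mathbb{A}$ is convex, given $\beta \in \mathbb{A}$ we can choose the perturbation $\alpha^\epsilon_t = \alpha_t + \epsilon(\beta_t - \alpha_t)$ which
is still in $\mathbb{A}$ for $0 \le \epsilon \le 1$. Since $\alpha$ is optimal, we have the inequality

$$\frac{d}{d\epsilon}J(\alpha + \epsilon(\beta - \alpha))\Big|_{\epsilon=0} = \mathbb{E}\int_0^T \partial_\alpha H(t,X_t,Y_t,Z_t,\alpha_t)\cdot(\beta_t - \alpha_t)\,dt \ge 0.$$

By convexity of the Hamiltonian with respect to the control variable $\alpha \in A$, we conclude that

$$\mathbb{E}\int_0^T \bigl[H(t,X_t,Y_t,Z_t,\beta_t) - H(t,X_t,Y_t,Z_t,\alpha_t)\bigr]\,dt \ge 0,$$

for all $\beta$. Now, if for a given (deterministic) $\alpha \in A$ we choose $\beta$ in the following way,

$$\beta_t(\omega) = \begin{cases} \alpha & \text{if } (t,\omega) \in C, \\ \alpha_t(\omega) & \text{otherwise,} \end{cases}$$

for an arbitrary progressively-measurable set $C \subset [0,T]\times\Omega$ (that is $C \cap [0,t]\times\Omega \in \mathcal{B}([0,t])\otimes\mathcal{F}_t$ for
any $t \in [0,T]$), we see that

$$\mathbb{E}\int_0^T \mathbf{1}_C\bigl[H(t,X_t,Y_t,Z_t,\alpha) - H(t,X_t,Y_t,Z_t,\alpha_t)\bigr]\,dt \ge 0,$$

from which we conclude that

$$H(t,X_t,Y_t,Z_t,\alpha) - H(t,X_t,Y_t,Z_t,\alpha_t) \ge 0 \qquad dt\otimes d\mathbb{P} \text{ a.e.},$$

which is the desired conclusion. □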

4.2. A Sufficient Condition. The necessary condition for optimality identified in the previous subsection
can be turned into a sufficient condition for optimality under some technical assumptions.

Theorem 4.6. Under the same assumptions of regularity on the coefficients as before, let $\alpha \in \mathbb{A}$ be
an admissible control, $X = X^\alpha$ the corresponding controlled state process, and $(Y,Z)$ the corresponding
adjoint processes. Let us also assume that for each $t \in [0,T]$

(1) $\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d) \ni (x,\mu) \mapsto g(x,\mu)$ is convex;

(2) $\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)\times A \ni (x,\mu,\alpha) \mapsto H(t,x,\mu,Y_t,Z_t,\alpha)$ is convex $dt\otimes d\mathbb{P}$ almost everywhere.

Moreover, if

$$H(t,X_t,\mathbb{P}_{X_t},Y_t,Z_t,\alpha_t) = \inf_{\alpha\in A} H(t,X_t,\mathbb{P}_{X_t},Y_t,Z_t,\alpha), \qquad dt\otimes d\mathbb{P} \text{ a.e.} \tag{33}$$

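then $\alpha$ is an optimal control, i.e. $J(\alpha) = \inf_{\alpha'\in\mathbb{A}} J(\alpha')$.

Proof. Let $\alpha' \in \mathbb{A}$ be a generic admissible control, and $X' = X^{\alpha'}$ the corresponding controlled
state. By definition of the objective function of the control problem we have:

$$\begin{aligned}
J(\alpha) - J(\alpha') = {} & \mathbb{E}\bigl[g(X_T,\mathbb{P}_{X_T}) - g(X'_T,\mathbb{P}_{X'_T})\bigr] + \mathbb{E}\int_0^T \bigl[f(t,\theta_t) - f(t,\theta'_t)\bigr]\,dt \\
= {} & \mathbb{E}\bigl[g(X_T,\mathbb{P}_{X_T}) - g(X'_T,\mathbb{P}_{X'_T})\bigr] + \mathbb{E}\int_0^T \bigl[H(t,\Theta_t) - H(t,\Theta'_t)\bigr]\,dt \\
& - \mathbb{E}\int_0^T \bigl\{[b(t,\theta_t) - b(t,\theta'_t)]\cdot Y_t + [\sigma(t,\theta_t) - \sigma(t,\theta'_t)]\cdot Z_t\bigr\}\,dt
\end{aligned} \tag{34}$$

by definition of the Hamiltonian, where $\theta_t = (X_t,\mathbb{P}_{X_t},\alpha_t)$ and $\Theta_t = (X_t,\mathbb{P}_{X_t},Y_t,Z_t,\alpha_t)$ (and
similarly for $\theta'_t$ and $\Theta'_t$). The function $g$ being convex, we have

$$g(x,\mu) - g(x',\mu') \le (x - x')\cdot\partial_x g(x,\mu) + \tilde{\mathbb{E}}\bigl[\partial_\mu g(x,\mu)(\tilde X)\cdot(\tilde X - \tilde X')\bigr],$$

so that

$$\begin{aligned}
\mathbb{E}\bigl[g(X_T,\mathbb{P}_{X_T}) - g(X'_T,\mathbb{P}_{X'_T})\bigr]
&\le \mathbb{E}\bigl[\partial_x g(X_T,\mathbb{P}_{X_T})\cdot(X_T - X'_T) + \tilde{\mathbb{E}}[\partial_\mu g(X_T,\mathbb{P}_{X_T})(\tilde X_T)\cdot(\tilde X_T - \tilde X'_T)]\bigr] \\
&= \mathbb{E}\bigl[\bigl(\partial_x g(X_T,\mathbb{P}_{X_T}) + \tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T,\mathbb{P}_{X_T})(X_T)]\bigr)\cdot(X_T - X'_T)\bigr] \\
&= \mathbb{E}[Y_T\cdot(X_T - X'_T)] = \mathbb{E}[(X_T - X'_T)\cdot Y_T],
\end{aligned} \tag{35}$$

where we used Fubini's theorem and the fact that the 'tilde random variables' are independent copies
of the 'non-tilde variables'. Using the adjoint equation and taking the expectation, we get:

$$\begin{aligned}
\mathbb{E}[(X_T - X'_T)\cdot Y_T] = {} & \mathbb{E}\Bigl[\int_0^T (X_t - X'_t)\cdot dY_t + \int_0^T Y_t\cdot d[X_t - X'_t] + \int_0^T [\sigma(t,\theta_t) - \sigma(t,\theta'_t)]\cdot Z_t\,dt\Bigr] \\
= {} & -\mathbb{E}\int_0^T \bigl[\partial_x H(t,\Theta_t)\cdot(X_t - X'_t) + \tilde{\mathbb{E}}[\partial_\mu H(t,\tilde\Theta_t)(X_t)]\cdot(X_t - X'_t)\bigr]\,dt \\
& + \mathbb{E}\int_0^T \bigl[[b(t,\theta_t) - b(t,\theta'_t)]\cdot Y_t + [\sigma(t,\theta_t) - \sigma(t,\theta'_t)]\cdot Z_t\bigr]\,dt,
\end{aligned}$$

where we used integration by parts and the fact that $Y_t$ solves the adjoint equation. Using Fubini's
theorem and the fact that $\tilde\Theta_t$ is an independent copy of $\Theta_t$, the expectation of the second term in the
second line can be rewritten as

$$\begin{aligned}
\mathbb{E}\int_0^T \bigl\{\tilde{\mathbb{E}}[\partial_\mu H(t,\tilde\Theta_t)(X_t)]\cdot(X_t - X'_t)\bigr\}\,dt
&= \tilde{\mathbb{E}}\mathbb{E}\int_0^T \bigl\{[\partial_\mu H(t,\Theta_t)(\tilde X_t)]\cdot(\tilde X_t - \tilde X'_t)\bigr\}\,dt \\
&= \mathbb{E}\int_0^T \tilde{\mathbb{E}}\bigl[\partial_\mu H(t,\Theta_t)(\tilde X_t)\cdot(\tilde X_t - \tilde X'_t)\bigr]\,dt.
\end{aligned} \tag{36}$$

Consequently, by (34), (35) and (36), we obtain

$$\begin{aligned}
J(\alpha) - J(\alpha') \le {} & \mathbb{E}\int_0^T \bigl[H(t,\Theta_t) - H(t,\Theta'_t)\bigr]\,dt \\
& - \mathbb{E}\int_0^T \bigl\{\partial_x H(t,\Theta_t)\cdot(X_t - X'_t) + \tilde{\mathbb{E}}[\partial_\mu H(t,\Theta_t)(\tilde X_t)\cdot(\tilde X_t - \tilde X'_t)]\bigr\}\,dt \\
\le {} & 0
\end{aligned} \tag{37}$$

because of the convexity assumption on $H$, see in particular ( pj] ), and because of the criticality of the
admissible control $(\alpha_t)_{0\le t\le T}$, see (33), which says the first order derivative in $\alpha$ vanishes. □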

4.3. Special Cases. We consider a set of particular cases which already appeared in the literature, 
and we provide the special forms of the Pontryagin maximum principle which apply in these cases. 
We discuss only sufficient conditions for optimality for the sake of definiteness. The corresponding
necessary conditions can easily be derived from the results of Subsection 4.1.



Scalar Interactions. In this subsection we show how the model handled in (T\ appears as a specific 
example of our more general formulation. We consider scalar interactions for which the dependence 
upon the probability measure of the coefficients of the dynamics of the state and the cost functions is 
through functions of scalar moments of the measure. More specifically, we assume that: 

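$$\begin{aligned}
b(t,x,\mu,\alpha) &= \hat b(t,x,\langle\psi,\mu\rangle,\alpha), & \sigma(t,x,\mu,\alpha) &= \hat\sigma(t,x,\langle\phi,\mu\rangle,\alpha), \\
f(t,x,\mu,\alpha) &= \hat f(t,x,\langle\gamma,\mu\rangle,\alpha), & g(x,\mu) &= \hat g(x,\langle\zeta,\mu\rangle),
\end{aligned}$$

for some scalar functions $\psi$, $\phi$, $\gamma$ and $\zeta$ with at most quadratic growth at $\infty$, and functions $\hat b$, $\hat\sigma$ and $\hat f$
defined on $[0,T]\times\mathbb{R}^d\times\mathbb{R}\times A$ with values in $\mathbb{R}^d$, $\mathbb{R}^{d\times m}$ and $\mathbb{R}$ respectively, and a real valued function
$\hat g$ defined on $\mathbb{R}^d\times\mathbb{R}$. We use the bracket notation $\langle h,\mu\rangle$ to denote the integral of the function $h$ with
respect to the measure $\mu$. The functions $\hat b$, $\hat\sigma$, $\hat f$ and $\hat g$ are similar to the functions $b$, $\sigma$, $f$ and $g$ with
the variable $\mu$, which was a measure, replaced by a numeric variable, say $r$. Reserving the notation
$H$ for the Hamiltonian we defined above, we have:

$$H(t,x,\mu,y,z,\alpha) = \hat b(t,x,\langle\psi,\mu\rangle,\alpha)\cdot y + \hat\sigma(t,x,\langle\phi,\mu\rangle,\alpha)\cdot z + \hat f(t,x,\langle\gamma,\mu\rangle,\alpha).$$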

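We then proceed to derive the particular form taken by the adjoint equation in the present situation.
We start with the terminal condition as it is easier to identify. According to (22), it reads:

$$Y_T = \partial_x g(X_T,\mathbb{P}_{X_T}) + \tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T,\mathbb{P}_{X_T})(X_T)].$$

Since the terminal cost is of the form $g(x,\mu) = \hat g(x,\langle\zeta,\mu\rangle)$, given our definition of differentiability
with respect to the variable $\mu$, we know, as a generalization of Q, that $\partial_\mu g(x,\mu)(\,\cdot\,)$ reads

$$\partial_\mu g(x,\mu)(x') = \partial_r\hat g(x,\langle\zeta,\mu\rangle)\,\partial\zeta(x'), \qquad x' \in \mathbb{R}^d.$$

Therefore, the terminal condition $Y_T$ can be rewritten as

$$Y_T = \partial_x\hat g(X_T,\mathbb{E}[\zeta(X_T)]) + \tilde{\mathbb{E}}\bigl[\partial_r\hat g(\tilde X_T,\mathbb{E}[\zeta(X_T)])\bigr]\,\partial\zeta(X_T)$$

which is exactly the terminal condition used in [2] once we remark that the 'tildes' can be removed
since $\tilde X_T$ has the same distribution as $X_T$. Within this framework, convexity in $\mu$ is quite easy to
check. Here is a typical example borrowed from [2]: if $g$ and $\hat g$ do not depend on $x$, then the function
$\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto g(\mu) = \hat g(\langle\zeta,\mu\rangle)$ is convex if $\zeta$ is convex and $\hat g$ is non-decreasing and convex.

Similarly, $\partial_\mu H(t,x,\mu,y,z,\alpha)$ can be identified to the $\mathbb{R}^d$-valued function defined by

$$\begin{aligned}
\partial_\mu H(t,x,\mu,y,z,\alpha)(x') = {} & \bigl[\partial_r\hat b(t,x,\langle\psi,\mu\rangle,\alpha)\cdot y\bigr]\,\partial\psi(x') + \bigl[\partial_r\hat\sigma(t,x,\langle\phi,\mu\rangle,\alpha)\cdot z\bigr]\,\partial\phi(x') \\
& + \partial_r\hat f(t,x,\langle\gamma,\mu\rangle,\alpha)\,\partial\gamma(x')
\end{aligned}$$

and the dynamic part of the adjoint equation (22) rewrites:

$$\begin{aligned}
dY_t = {} & -\bigl\{\partial_x\hat b(t,X_t,\mathbb{E}[\psi(X_t)],\alpha_t)\odot Y_t + \partial_x\hat\sigma(t,X_t,\mathbb{E}[\phi(X_t)],\alpha_t)\odot Z_t \\
& \qquad + \partial_x\hat f(t,X_t,\mathbb{E}[\gamma(X_t)],\alpha_t)\bigr\}\,dt + Z_t\,dW_t \\
& - \bigl\{\tilde{\mathbb{E}}\bigl[\partial_r\hat b(t,\tilde X_t,\mathbb{E}[\psi(X_t)],\tilde\alpha_t)\cdot\tilde Y_t\bigr]\,\partial\psi(X_t) + \tilde{\mathbb{E}}\bigl[\partial_r\hat\sigma(t,\tilde X_t,\mathbb{E}[\phi(X_t)],\tilde\alpha_t)\cdot\tilde Z_t\bigr]\,\partial\phi(X_t) \\
& \qquad + \tilde{\mathbb{E}}\bigl[\partial_r\hat f(t,\tilde X_t,\mathbb{E}[\gamma(X_t)],\tilde\alpha_t)\bigr]\,\partial\gamma(X_t)\bigr\}\,dt,
\end{aligned}$$

which again, is exactly the adjoint equation used in [2] once we remove the 'tildes'.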
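
The chain-rule formula for the measure derivative of a scalar interaction can be checked numerically on empirical measures. The sketch below is an illustration only. It takes the hypothetical choices zeta(x) = x^2 and g_hat(r) = r^2 in dimension one with no dependence on x, and it relies on the standard identification, assumed here and not stated in the text, that N times the partial derivative of the empirical projection (x_1, ..., x_N) -> g((1/N) sum_i delta_{x_i}) with respect to x_i equals the measure derivative evaluated at x_i.

```python
# Hypothetical scalar interaction: g(mu) = g_hat(<zeta, mu>), d = 1.
def zeta(x):
    return x * x

def dzeta(x):
    return 2.0 * x

def g_hat(r):
    return r * r

def dr_g_hat(r):
    return 2.0 * r

def moment(points):
    """<zeta, mu> for the empirical measure mu of `points`."""
    return sum(zeta(p) for p in points) / len(points)

def g_empirical(points):
    """g evaluated at the empirical measure of `points`."""
    return g_hat(moment(points))

def dmu_g(points, x_prime):
    """Chain-rule formula: d_r g_hat(<zeta, mu>) * d zeta(x')."""
    return dr_g_hat(moment(points)) * dzeta(x_prime)

def fd_particle_derivative(points, i, h=1e-5):
    """N times the central finite difference of g_empirical in particle i."""
    up = list(points)
    up[i] += h
    down = list(points)
    down[i] -= h
    return len(points) * (g_empirical(up) - g_empirical(down)) / (2.0 * h)

points = [-1.0, 0.5, 2.0, 3.0]
errors = [
    abs(fd_particle_derivative(points, i) - dmu_g(points, points[i]))
    for i in range(len(points))
]
```

The list `errors` compares the two computations particle by particle; it should be small up to finite-difference and rounding error.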

Remark 4.7. The mean variance portfolio optimization example discussed in HI and the solution 
proposed in [3] and fSl of the optimal control of linear-quadratic (LQ) McKean-Vlasov dynamics are 
based on the general form of the Pontryagin principle proven in this section as applied to the scalar 
interactions considered in this subsection. 

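First Order Interactions. In the case of first order interactions, the dependence upon the probability
measure is linear in the sense that the coefficients $b$, $\sigma$, $f$ and $g$ are given in the form

$$\begin{aligned}
b(t,x,\mu,\alpha) &= \langle\hat b(t,x,\,\cdot\,,\alpha),\mu\rangle, & \sigma(t,x,\mu,\alpha) &= \langle\hat\sigma(t,x,\,\cdot\,,\alpha),\mu\rangle, \\
f(t,x,\mu,\alpha) &= \langle\hat f(t,x,\,\cdot\,,\alpha),\mu\rangle, & g(x,\mu) &= \langle\hat g(x,\,\cdot\,),\mu\rangle,
\end{aligned}$$

for some functions $\hat b$, $\hat\sigma$ and $\hat f$ defined on $[0,T]\times\mathbb{R}^d\times\mathbb{R}^d\times A$ with values in $\mathbb{R}^d$, $\mathbb{R}^{d\times m}$ and
$\mathbb{R}$ respectively, and a real valued function $\hat g$ defined on $\mathbb{R}^d\times\mathbb{R}^d$. The form of this dependence
comes from the original derivation of the McKean-Vlasov equation as limit of the dynamics of a large
system of particles evolving according to a system of stochastic differential equations with mean field
interactions of the form

$$dX^i_t = \frac{1}{N}\sum_{j=1}^N \hat b(t,X^i_t,X^j_t)\,dt + \frac{1}{N}\sum_{j=1}^N \hat\sigma(t,X^i_t,X^j_t)\,dW^i_t, \qquad i = 1,\dots,N, \quad 0\le t\le T, \tag{38}$$

where the $W^i$'s are independent standard Wiener processes in $\mathbb{R}^m$. In the present situation the linearity
in $\mu$ implies that $\partial_\mu g(x,\mu)(x') = \partial_{x'}\hat g(x,x')$ and similarly

$$\partial_\mu H(t,x,\mu,y,z,\alpha)(x') = \partial_{x'}\hat b(t,x,x',\alpha)\odot y + \partial_{x'}\hat\sigma(t,x,x',\alpha)\odot z + \partial_{x'}\hat f(t,x,x',\alpha),$$

and the dynamic part of the adjoint equation (22) rewrites:

$$dY_t = -\tilde{\mathbb{E}}\bigl[\partial_x\hat H(t,X_t,\tilde X_t,Y_t,Z_t,\alpha_t) + \partial_{x'}\hat H(t,\tilde X_t,X_t,\tilde Y_t,\tilde Z_t,\tilde\alpha_t)\bigr]\,dt + Z_t\,dW_t,$$

if we use the obvious notation

$$\hat H(t,x,x',y,z,\alpha) = \hat b(t,x,x',\alpha)\cdot y + \hat\sigma(t,x,x',\alpha)\cdot z + \hat f(t,x,x',\alpha),$$

and the terminal condition is given by

$$Y_T = \tilde{\mathbb{E}}\bigl[\partial_x\hat g(X_T,\tilde X_T) + \partial_{x'}\hat g(\tilde X_T,X_T)\bigr].$$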
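
To make the particle system (38) concrete, here is a minimal Euler-Maruyama sketch in the scalar case d = m = 1. The function name `simulate_particles`, the step count, the seed and the two kernels below are hypothetical choices made for the illustration; they are not taken from the paper. With the kernel b_hat(t, x, x') = x' - x and no noise, each particle is pulled towards the empirical mean, which itself is conserved by the dynamics.

```python
import math
import random

def simulate_particles(b_hat, sigma_hat, x0, T=1.0, n_steps=100, seed=0):
    """Euler-Maruyama scheme for the scalar (d = m = 1) version of system (38)."""
    rng = random.Random(seed)
    n = len(x0)
    dt = T / n_steps
    x = list(x0)
    for k in range(n_steps):
        t = k * dt
        new_x = []
        for i in range(n):
            # Mean-field drift and volatility: averages over all particles j.
            drift = sum(b_hat(t, x[i], x[j]) for j in range(n)) / n
            vol = sum(sigma_hat(t, x[i], x[j]) for j in range(n)) / n
            # Independent Brownian increment for particle i.
            dw = rng.gauss(0.0, math.sqrt(dt))
            new_x.append(x[i] + drift * dt + vol * dw)
        x = new_x
    return x

# Hypothetical kernels: attraction towards the other particles, no noise.
x0 = [-2.0, 0.0, 1.0, 5.0]
xT = simulate_particles(lambda t, x, y: y - x, lambda t, x, y: 0.0, x0)
```

As N grows, the empirical measure of such a system is expected to approach the marginal law of the McKean-Vlasov dynamics, which is the limit alluded to in the text.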
5. Solvability of Forward-Backward Systems

We now turn to the application of the stochastic maximum principle to the solution of the optimal 
control of McKean-Vlasov dynamics. The strategy is to identify a minimizer of the Hamiltonian, and 
to use it in the forward dynamics and the adjoint equation. This creates a coupling between these 
equations, leading to the study of an FBSDE of mean field type. As explained in the introduction, 
the existence results proven in [7] and [6] do not cover some of the solvable models (such as the LQ 
models). Here we establish existence and uniqueness by taking advantage of the specific structure of 
the equation, inherited from the underlying optimization problem. Assuming that the terminal cost 
and the Hamiltonian satisfy the same convexity assumptions as in the statement of Theorem 4.6, we
indeed prove that unique solvability holds by applying the continuation method, originally exposed
within the framework of FBSDEs in [14]. Some of the results of this section were announced in the
note [8].

5.1. Technical Assumptions. We now formulate the assumptions we shall use from now on. These
assumptions subsume the assumptions (A1-4) introduced in Sections 2 and 4. As it is most often
the case in applications of Pontryagin's stochastic maximum principle, we choose $A = \mathbb{R}^k$ and we
consider a linear model for the forward dynamics of the state.

(B1) The drift $b$ and the volatility $\sigma$ are linear in $\mu$, $x$ and $\alpha$. They read

$$\begin{aligned}
b(t,x,\mu,\alpha) &= b_0(t) + b_1(t)\bar\mu + b_2(t)x + b_3(t)\alpha, \\
\sigma(t,x,\mu,\alpha) &= \sigma_0(t) + \sigma_1(t)\bar\mu + \sigma_2(t)x + \sigma_3(t)\alpha,
\end{aligned}$$

for some bounded measurable deterministic functions $b_0$, $b_1$, $b_2$ and $b_3$ with values in $\mathbb{R}^d$, $\mathbb{R}^{d\times d}$, $\mathbb{R}^{d\times d}$
and $\mathbb{R}^{d\times k}$ and $\sigma_0$, $\sigma_1$, $\sigma_2$ and $\sigma_3$ with values in $\mathbb{R}^{d\times m}$, $\mathbb{R}^{(d\times m)\times d}$, $\mathbb{R}^{(d\times m)\times d}$ and $\mathbb{R}^{(d\times m)\times k}$ (the
parentheses around $d\times m$ indicating that $\sigma_i(t)u_i$ is seen as an element of $\mathbb{R}^{d\times m}$ whenever $u_i \in \mathbb{R}^d$,
with $i = 1, 2$, or $u_i \in \mathbb{R}^k$, with $i = 3$), and where we use the notation $\bar\mu = \int x\,d\mu(x)$ for the mean of
a measure $\mu$.

(B2) The functions $f$ and $g$ satisfy the same assumptions as in (A.3-4) in Section 4 (with respect
to some constant $L$). In particular, there exists a constant $L$ such that

$$\begin{aligned}
& |f(t,x',\mu',\alpha') - f(t,x,\mu,\alpha)| + |g(x',\mu') - g(x,\mu)| \\
& \qquad \le L\bigl[1 + |x'| + |x| + |\alpha'| + |\alpha| + \|\mu\|_2 + \|\mu'\|_2\bigr]\bigl[|(x',\alpha') - (x,\alpha)| + W_2(\mu',\mu)\bigr].
\end{aligned}$$

(B3) There exists a constant $c > 0$ such that the derivatives of $f$ and $g$ with respect to $(x,\alpha)$ and $x$
respectively are $c$-Lipschitz continuous with respect to $(x,\alpha,\mu)$ and $(x,\mu)$ respectively (the Lipschitz
property in the variable $\mu$ being understood in the sense of the 2-Wasserstein distance). Moreover,
for any $t \in [0,T]$, any $x, x' \in \mathbb{R}^d$, any $\alpha, \alpha' \in \mathbb{R}^k$, any $\mu, \mu' \in \mathcal{P}_2(\mathbb{R}^d)$ and any $\mathbb{R}^d$-valued random
variables $X$ and $X'$ having $\mu$ and $\mu'$ as respective distributions,

$$\begin{aligned}
\mathbb{E}\bigl[|\partial_\mu f(t,x',\mu',\alpha')(X') - \partial_\mu f(t,x,\mu,\alpha)(X)|^2\bigr] &\le c\bigl(|(x',\alpha') - (x,\alpha)|^2 + \mathbb{E}[|X' - X|^2]\bigr), \\
\mathbb{E}\bigl[|\partial_\mu g(x',\mu')(X') - \partial_\mu g(x,\mu)(X)|^2\bigr] &\le c\bigl(|x' - x|^2 + \mathbb{E}[|X' - X|^2]\bigr).
\end{aligned}$$

(B4) The function $f$ is convex with respect to $(x,\mu,\alpha)$ for $t$ fixed in such a way that, for some
$\lambda > 0$,

$$\begin{aligned}
& f(t,x',\mu',\alpha') - f(t,x,\mu,\alpha) \\
& \qquad - \partial_{(x,\alpha)}f(t,x,\mu,\alpha)\cdot(x' - x,\alpha' - \alpha) - \tilde{\mathbb{E}}\bigl[\partial_\mu f(t,x,\mu,\alpha)(\tilde X)\cdot(\tilde X' - \tilde X)\bigr] \ge \lambda|\alpha' - \alpha|^2,
\end{aligned}$$

whenever $\tilde X, \tilde X' \in L^2(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}};\mathbb{R}^d)$ with distributions $\mu$ and $\mu'$ respectively. The function $g$ is also
assumed to be convex in $(x,\mu)$ (on the same model, but with $\lambda = 0$).

We refer to Subsection 3.2 for a precise discussion about (B3) and (B4). By comparing (7) with
(B3), notice in particular that the liftings $L^2(\tilde\Omega;\mathbb{R}^d) \ni \tilde X \mapsto f(t,x,\mathbb{P}_{\tilde X},\alpha)$ and $L^2(\tilde\Omega;\mathbb{R}^d) \ni \tilde X \mapsto g(x,\mathbb{P}_{\tilde X})$
have Lipschitz continuous derivatives. As a consequence, Lemma 3.2 applies: For any
$t \in [0,T]$, $x \in \mathbb{R}^d$, $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and $\alpha \in \mathbb{R}^k$, there exist versions of $\mathbb{R}^d \ni x' \mapsto \partial_\mu f(t,x,\mu,\alpha)(x')$
and $\mathbb{R}^d \ni x' \mapsto \partial_\mu g(x,\mu)(x')$ that are $c$-Lipschitz continuous.

Following Example (6), we also emphasize that $b$ and $\sigma$ obviously satisfy (B3).

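5.2. The Hamiltonian and the Adjoint Equations. The drift and the volatility being linear, the
Hamiltonian has the form

$$\begin{aligned}
H(t,x,\mu,y,z,\alpha) = {} & \bigl[b_0(t) + b_1(t)\bar\mu + b_2(t)x + b_3(t)\alpha\bigr]\cdot y \\
& + \bigl[\sigma_0(t) + \sigma_1(t)\bar\mu + \sigma_2(t)x + \sigma_3(t)\alpha\bigr]\cdot z + f(t,x,\mu,\alpha),
\end{aligned}$$

for $t \in [0,T]$, $x, y \in \mathbb{R}^d$, $z \in \mathbb{R}^{d\times m}$, $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and $\alpha \in \mathbb{R}^k$. Given $(t,x,\mu,y,z) \in [0,T]\times\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d\times\mathbb{R}^{d\times m}$,
the function $\mathbb{R}^k \ni \alpha \mapsto H(t,x,\mu,y,z,\alpha)$ is strictly convex so that there exists
a unique minimizer $\hat\alpha(t,x,\mu,y,z)$:

$$\hat\alpha(t,x,\mu,y,z) = \operatorname{argmin}_{\alpha\in\mathbb{R}^k} H(t,x,\mu,y,z,\alpha). \tag{39}$$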
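
As an illustration of (39), consider the hypothetical scalar case d = m = k = 1 with running cost f = lam * alpha^2 plus terms that do not depend on alpha, so that (B4) holds with the constant lam. This choice of f and the names below are assumptions made only for this sketch. The first-order condition b3*y + s3*z + 2*lam*alpha = 0 then gives the minimizer in closed form, and a coarse grid search confirms that no nearby control does better.

```python
def hamiltonian_control_part(alpha, y, z, b3, s3, lam):
    """alpha-dependent part of H when f = lam * alpha**2 + (terms free of alpha)."""
    return b3 * alpha * y + s3 * alpha * z + lam * alpha * alpha

def alpha_hat(y, z, b3, s3, lam):
    """Solve the first-order condition b3*y + s3*z + 2*lam*alpha = 0."""
    return -(b3 * y + s3 * z) / (2.0 * lam)

# Hypothetical numerical values for the adjoint variables and coefficients.
y, z, b3, s3, lam = 2.0, -1.0, 1.0, 0.5, 1.0
a_star = alpha_hat(y, z, b3, s3, lam)
h_star = hamiltonian_control_part(a_star, y, z, b3, s3, lam)

# Grid of candidate controls centered at the closed-form minimizer.
grid = [a_star + 0.01 * k for k in range(-300, 301)]
h_grid_min = min(hamiltonian_control_part(a, y, z, b3, s3, lam) for a in grid)
```

In this example the minimizer is linear in (y, z), which is one simple instance of the Lipschitz dependence on the adjoint variables discussed next.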

Assumptions (B1-4) above being slightly stronger than the assumptions used in Q, we can follow
the arguments given in the proof of Lemma 2.1 of [7] in order to prove that the function
$[0,T]\times\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d\times\mathbb{R}^{d\times m} \ni (t,x,\mu,y,z) \mapsto \hat\alpha(t,x,\mu,y,z)$
is measurable, locally bounded and Lipschitz-continuous with respect to $(x,\mu,y,z)$,
uniformly in $t \in [0,T]$, the Lipschitz constant depending only upon $\lambda$, the supremum norms of $b_3$ and
$\sigma_3$ and the Lipschitz constant of $\partial_\alpha f$ in $(x,\mu)$. Except maybe for the Lipschitz property with respect
to the measure argument, these facts were explicitly proved in Q. The regularity of $\hat\alpha$ with respect
to $\mu$ follows from the following remark. If $(t,x,y,z) \in [0,T]\times\mathbb{R}^d\times\mathbb{R}^d\times\mathbb{R}^{d\times m}$ is fixed and $\mu$, $\mu'$
are generic elements in $\mathcal{P}_2(\mathbb{R}^d)$, $\hat\alpha$ and $\hat\alpha'$ denoting the associated minimizers, we deduce from the
convexity assumption (B4):

$$\begin{aligned}
2\lambda|\hat\alpha' - \hat\alpha|^2 &\le (\hat\alpha' - \hat\alpha)\cdot\bigl[\partial_\alpha f(t,x,\mu,\hat\alpha') - \partial_\alpha f(t,x,\mu,\hat\alpha)\bigr] \\
&= (\hat\alpha' - \hat\alpha)\cdot\bigl[\partial_\alpha H(t,x,\mu,y,z,\hat\alpha') - \partial_\alpha H(t,x,\mu,y,z,\hat\alpha)\bigr] \\
&= (\hat\alpha' - \hat\alpha)\cdot\bigl[\partial_\alpha H(t,x,\mu,y,z,\hat\alpha') - \partial_\alpha H(t,x,\mu',y,z,\hat\alpha')\bigr] \\
&= (\hat\alpha' - \hat\alpha)\cdot\bigl[\partial_\alpha f(t,x,\mu,\hat\alpha') - \partial_\alpha f(t,x,\mu',\hat\alpha')\bigr] \\
&\le C\,|\hat\alpha' - \hat\alpha|\,W_2(\mu',\mu),
\end{aligned} \tag{40}$$

the passage from the second to the third line following from the identity

$$\partial_\alpha H(t,x,\mu,y,z,\hat\alpha) = \partial_\alpha H(t,x,\mu',y,z,\hat\alpha') = 0.$$
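For each admissible control $\alpha = (\alpha_t)_{0\le t\le T}$, if we denote the corresponding solution of the state
equation by $X = (X^\alpha_t)_{0\le t\le T}$, then the adjoint BSDE (22) introduced in Definition 3.5 reads:

$$\begin{aligned}
dY_t = {} & -\partial_x f(t,X_t,\mathbb{P}_{X_t},\alpha_t)\,dt - b_2^\dagger(t)Y_t\,dt - \sigma_2^\dagger(t)Z_t\,dt + Z_t\,dW_t \\
& - \tilde{\mathbb{E}}\bigl[\partial_\mu f(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\bigr]\,dt - b_1^\dagger(t)\mathbb{E}[Y_t]\,dt - \sigma_1^\dagger(t)\mathbb{E}[Z_t]\,dt.
\end{aligned} \tag{41}$$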
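
The passage from the general adjoint equation (22) to the linear form (41) can be sketched as follows. The computation assumes that $\dagger$ denotes transposition (the symbol is garbled in the extracted text) and uses that the lifting of $\mu \mapsto \bar\mu$ is $\tilde X \mapsto \tilde{\mathbb{E}}[\tilde X]$, whose derivative is the identity.

```latex
% Sketch: derivatives of the Hamiltonian under the linear structure (B1).
\begin{align*}
\partial_x H(t,x,\mu,y,z,\alpha)
  &= b_2^\dagger(t)\,y + \sigma_2^\dagger(t)\,z + \partial_x f(t,x,\mu,\alpha), \\
\partial_\mu H(t,x,\mu,y,z,\alpha)(x')
  &= b_1^\dagger(t)\,y + \sigma_1^\dagger(t)\,z + \partial_\mu f(t,x,\mu,\alpha)(x'), \\
\tilde{\mathbb{E}}\bigl[\partial_\mu H(t,\tilde X_t,\mathbb{P}_{X_t},\tilde Y_t,\tilde Z_t,\tilde\alpha_t)(X_t)\bigr]
  &= b_1^\dagger(t)\,\mathbb{E}[Y_t] + \sigma_1^\dagger(t)\,\mathbb{E}[Z_t]
   + \tilde{\mathbb{E}}\bigl[\partial_\mu f(t,\tilde X_t,\mathbb{P}_{X_t},\tilde\alpha_t)(X_t)\bigr].
\end{align*}
```

The last line uses that $(\tilde Y_t,\tilde Z_t)$ has the same law as $(Y_t,Z_t)$, so the tildes can be dropped inside the expectations of the linear terms.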



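Given the necessary and sufficient conditions proven in the previous section, our goal is to use the
control $(\hat\alpha_t)_{0\le t\le T}$ defined by $\hat\alpha_t = \hat\alpha(t,X_t,\mathbb{P}_{X_t},Y_t,Z_t)$ where $\hat\alpha$ is the minimizer function constructed
above and the process $(X_t,Y_t,Z_t)_{0\le t\le T}$ is a solution of the FBSDE

$$\begin{aligned}
dX_t = {} & \bigl[b_0(t) + b_1(t)\mathbb{E}[X_t] + b_2(t)X_t + b_3(t)\hat\alpha(t,X_t,\mathbb{P}_{X_t},Y_t,Z_t)\bigr]\,dt \\
& + \bigl[\sigma_0(t) + \sigma_1(t)\mathbb{E}[X_t] + \sigma_2(t)X_t + \sigma_3(t)\hat\alpha(t,X_t,\mathbb{P}_{X_t},Y_t,Z_t)\bigr]\,dW_t, \\
dY_t = {} & -\bigl[\partial_x f\bigl(t,X_t,\mathbb{P}_{X_t},\hat\alpha(t,X_t,\mathbb{P}_{X_t},Y_t,Z_t)\bigr) + b_2^\dagger(t)Y_t + \sigma_2^\dagger(t)Z_t\bigr]\,dt + Z_t\,dW_t \\
& - \bigl\{\tilde{\mathbb{E}}\bigl[\partial_\mu f\bigl(t,\tilde X_t,\mathbb{P}_{X_t},\hat\alpha(t,\tilde X_t,\mathbb{P}_{X_t},\tilde Y_t,\tilde Z_t)\bigr)(X_t)\bigr] + b_1^\dagger(t)\mathbb{E}[Y_t] + \sigma_1^\dagger(t)\mathbb{E}[Z_t]\bigr\}\,dt,
\end{aligned} \tag{42}$$

with the initial condition $X_0 = x_0$, for a given deterministic point $x_0 \in \mathbb{R}^d$, and the terminal condition
$Y_T = \partial_x g(X_T,\mathbb{P}_{X_T}) + \tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T,\mathbb{P}_{X_T})(X_T)]$.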

5.3. Main Result. Here is the main existence and uniqueness result:

Theorem 5.1. Under (B1-4), the forward-backward system (42) is uniquely solvable.

Proof. The proof is an adaptation of the continuation method used in [14] to handle standard FBSDEs
satisfying appropriate monotonicity conditions. Generally speaking, it consists in proving that existence
and uniqueness are preserved when the coefficients in (42) are slightly perturbed. Starting
from an initial case for which existence and uniqueness are known to hold, we then establish Theorem
5.1 by modifying iteratively the coefficients so that (42) is eventually shown to belong to the class of
uniquely solvable systems.

A natural and simple strategy then consists in modifying the coefficients in a linear way. Unfortunately,
this might generate heavy notations. For that reason, we use the following conventions.

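First, as in Subsection 4.1, the notation $(\Theta_t)_{0\le t\le T}$ stands for the generic notation for denoting a
process of the form $(X_t,\mathbb{P}_{X_t},Y_t,Z_t,\alpha_t)_{0\le t\le T}$ with values in $\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d\times\mathbb{R}^{d\times m}\times\mathbb{R}^k$. We
will then denote by $\mathbb{S}$ the space of processes $(\Theta_t)_{0\le t\le T}$ such that $(X_t,Y_t,Z_t,\alpha_t)_{0\le t\le T}$ is $(\mathcal{F}_t)_{0\le t\le T}$
progressively-measurable, $(X_t)_{0\le t\le T}$ and $(Y_t)_{0\le t\le T}$ have continuous trajectories, and

$$\|\Theta\|_{\mathbb{S}} = \mathbb{E}\Bigl[\sup_{0\le t\le T}\bigl[|X_t|^2 + |Y_t|^2\bigr] + \int_0^T \bigl[|Z_t|^2 + |\alpha_t|^2\bigr]\,dt\Bigr]^{1/2} < +\infty. \tag{43}$$

Similarly, the notation $(\theta_t)_{0\le t\le T}$ is the generic notation for denoting a process $(X_t,\mathbb{P}_{X_t},\alpha_t)_{0\le t\le T}$
with values in $\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^k$. All the processes $(\theta_t)_{0\le t\le T}$ that are considered below appear as
the restrictions of an extended process $(\Theta_t)_{0\le t\le T} \in \mathbb{S}$.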

Moreover, we call an initial condition for (42) a square-integrable $\mathcal{F}_0$-measurable random variable
$\xi$ with values in $\mathbb{R}^d$, that is an element of $L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$. Recall indeed that $\mathcal{F}_0$ can be chosen as a
$\sigma$-algebra independent of $(W_t)_{0\le t\le T}$. In comparison with the statement of Theorem 5.1, this permits
to generalize the case when $\xi$ is deterministic.

Finally, we call an input for (42) a four-tuple $\mathcal{I} = ((\mathcal{I}^b_t,\mathcal{I}^\sigma_t,\mathcal{I}^f_t)_{0\le t\le T},\mathcal{I}^g_T)$, $(\mathcal{I}^b_t)_{0\le t\le T}$, $(\mathcal{I}^\sigma_t)_{0\le t\le T}$
and $(\mathcal{I}^f_t)_{0\le t\le T}$ being three square integrable progressively-measurable processes with values in $\mathbb{R}^d$,
$\mathbb{R}^{d\times m}$ and $\mathbb{R}^d$ respectively, and $\mathcal{I}^g_T$ denoting a square-integrable $\mathcal{F}_T$-measurable random variable with
values in $\mathbb{R}^d$. Such an input is specifically designed to be injected into the dynamics of (42), $\mathcal{I}^b$ being
plugged into the drift of the forward equation, $\mathcal{I}^\sigma$ into the volatility of the forward equation, $\mathcal{I}^f$
into the bounded variation term of the backward equation and $\mathcal{I}^g$ into the terminal condition of the
backward equation. The space of inputs is denoted by $\mathbb{I}$. It is endowed with the norm:

$$\|\mathcal{I}\|_{\mathbb{I}} = \mathbb{E}\Bigl[|\mathcal{I}^g_T|^2 + \int_0^T \bigl[|\mathcal{I}^b_t|^2 + |\mathcal{I}^\sigma_t|^2 + |\mathcal{I}^f_t|^2\bigr]\,dt\Bigr]^{1/2}. \tag{44}$$

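We then put:

Definition 5.2. For any $\gamma \in [0,1]$, any $\xi \in L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$ and any input $\mathcal{I} \in \mathbb{I}$, the FBSDE

$$\begin{aligned}
dX_t &= \bigl(\gamma b(t,\theta_t) + \mathcal{I}^b_t\bigr)\,dt + \bigl(\gamma\sigma(t,\theta_t) + \mathcal{I}^\sigma_t\bigr)\,dW_t, \\
dY_t &= -\bigl(\gamma\bigl\{\partial_x H(t,\Theta_t) + \tilde{\mathbb{E}}[\partial_\mu H(t,\tilde\Theta_t)(X_t)]\bigr\} + \mathcal{I}^f_t\bigr)\,dt + Z_t\,dW_t, \qquad t \in [0,T],
\end{aligned} \tag{45}$$

with the optimality condition

$$\alpha_t = \hat\alpha(t,X_t,\mathbb{P}_{X_t},Y_t,Z_t), \qquad t \in [0,T], \tag{46}$$

and with $X_0 = \xi$ as initial condition and

$$Y_T = \gamma\bigl\{\partial_x g(X_T,\mathbb{P}_{X_T}) + \tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T,\mathbb{P}_{X_T})(X_T)]\bigr\} + \mathcal{I}^g_T$$

as terminal condition, is referred to as $\mathcal{E}(\gamma,\xi,\mathcal{I})$.

Whenever $(X_t,Y_t,Z_t)_{0\le t\le T}$ is a solution, the full process $(X_t,\mathbb{P}_{X_t},Y_t,Z_t,\alpha_t)_{0\le t\le T}$ is referred to
as the associated extended solution.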

Remark 5.3. The way the coupling is summarized between the forward and backward equations in
(45) is a bit different from the way Equation (42) is written. In the formulation used in the statement of
Definition 5.2, the coupling between the forward and the backward equations follows from the optimality
condition (46). Because of that optimality condition, the two formulations are equivalent: When
$\gamma = 1$ and $\mathcal{I} \equiv 0$, the pair (45)-(46) coincides with (42).

The following lemma is proved in the next subsection: 

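Lemma 5.4. Given $\gamma \in [0,1]$, we say that property $(\mathcal{S}_\gamma)$ holds true if, for any $\xi \in L^2(\Omega, \mathcal{F}_0, \mathbb{P}; \mathbb{R}^d)$ and any $\mathcal{I} \in \mathbb{I}$, the FBSDE $\mathcal{E}(\gamma, \xi, \mathcal{I})$ has a unique extended solution in $\mathbb{S}$. With this definition, there exists $\delta_0 > 0$ such that, if $(\mathcal{S}_\gamma)$ holds true for some $\gamma \in [0,1)$, then $(\mathcal{S}_{\gamma+\eta})$ holds true for any $\eta \in (0, \delta_0]$ satisfying $\gamma + \eta \le 1$.

Given Lemma 5.4, Theorem 5.1 follows from a straightforward induction as $(\mathcal{S}_0)$ obviously holds true. □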
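The induction behind this last step can be made concrete with a short sketch. It assumes a hypothetical numerical value for $\delta_0$ (Lemma 5.4 only asserts its existence) and lists the values of $\gamma$ visited when passing from $(\mathcal{S}_0)$ to $(\mathcal{S}_1)$ in steps of size at most $\delta_0$. The function name `continuation_schedule` is illustrative and not part of the paper.

```python
from fractions import Fraction
import math


def continuation_schedule(delta0):
    """Return the values of gamma visited when going from 0 to 1
    in equal steps of size at most delta0 (exact rational arithmetic)."""
    delta0 = Fraction(delta0)
    if delta0 <= 0:
        raise ValueError("delta0 must be positive")
    n_steps = math.ceil(1 / delta0)   # finitely many steps suffice
    eta = Fraction(1, n_steps)        # common step size, eta <= delta0
    return [k * eta for k in range(n_steps + 1)]


# Hypothetical value of delta0, chosen only for illustration.
schedule = continuation_schedule(Fraction(3, 10))
```

Each consecutive pair in `schedule` corresponds to one application of Lemma 5.4, so the number of applications is finite and depends only on $\delta_0$.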
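5.4. Proof of Lemma 5.4. The proof follows from Picard's contraction theorem. As in the statement, consider indeed $\gamma$ such that $(\mathcal{S}_\gamma)$ holds true. For $\eta > 0$, $\xi \in L^2(\Omega, \mathcal{F}_0, \mathbb{P}; \mathbb{R}^d)$ and $\mathcal{I} \in \mathbb{I}$, we then define a mapping $\Phi$ from $\mathbb{S}$ into itself whose fixed points coincide with the solutions of $\mathcal{E}(\gamma + \eta, \xi, \mathcal{I})$.

The definition of $\Phi$ is as follows. Given a process $\Theta \in \mathbb{S}$, we denote by $\Theta'$ the extended solution of the FBSDE $\mathcal{E}(\gamma, \xi, \mathcal{I}')$ with

$$\begin{aligned}
\mathcal{I}^{b,\prime}_t &= \eta\, b(t,\theta_t) + \mathcal{I}^b_t, \\
\mathcal{I}^{\sigma,\prime}_t &= \eta\, \sigma(t,\theta_t) + \mathcal{I}^\sigma_t, \\
\mathcal{I}^{f,\prime}_t &= \eta\, \partial_x H(t,\Theta_t) + \eta\, \tilde{\mathbb{E}}[\partial_\mu H(t,\tilde\Theta_t)(X_t)] + \mathcal{I}^f_t, \\
\mathcal{I}^{g,\prime}_T &= \eta\, \partial_x g(X_T, \mathbb{P}_{X_T}) + \eta\, \tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T, \mathbb{P}_{X_T})(X_T)] + \mathcal{I}^g_T.
\end{aligned}$$

By assumption, it is uniquely defined and it belongs to $\mathbb{S}$, so that the mapping $\Phi : \Theta \mapsto \Theta'$ maps $\mathbb{S}$ into itself. It is then clear that a process $\Theta \in \mathbb{S}$ is a fixed point of $\Phi$ if and only if $\Theta$ is an extended solution of $\mathcal{E}(\gamma + \eta, \xi, \mathcal{I})$. The point is thus to prove that $\Phi$ is a contraction when $\eta$ is small enough. This is a consequence of the following lemma: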

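Lemma 5.5. Let $\gamma \in [0,1]$ be such that $(\mathcal{S}_\gamma)$ holds true. Then, there exists a constant $C$, independent of $\gamma$, such that, for any $\xi, \xi' \in L^2(\Omega, \mathcal{F}_0, \mathbb{P}; \mathbb{R}^d)$ and $\mathcal{I}, \mathcal{I}' \in \mathbb{I}$, the respective extended solutions $\Theta$ and $\Theta'$ of $\mathcal{E}(\gamma, \xi, \mathcal{I})$ and $\mathcal{E}(\gamma, \xi', \mathcal{I}')$ satisfy:

$$\|\Theta - \Theta'\|_{\mathbb{S}} \le C\big( \mathbb{E}[|\xi - \xi'|^2]^{1/2} + \|\mathcal{I} - \mathcal{I}'\|_{\mathbb{I}} \big).$$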



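Given Lemma 5.5, we indeed check that $\Phi$ is a contraction when $\eta$ is small enough. Consider two processes $\Theta^1$ and $\Theta^2$ in $\mathbb{S}$ and denote by $\Theta'^{,1}$ and $\Theta'^{,2}$ their respective images by $\Phi$. Both images solve an FBSDE with the same parameter $\gamma$ and the same initial condition $\xi$, and the two associated inputs differ only through terms that carry the factor $\eta$. Thanks to the Lipschitz property of the coefficients, the distance between these two inputs in $\mathbb{I}$ is thus at most $C\eta\|\Theta^1 - \Theta^2\|_{\mathbb{S}}$, and we deduce from Lemma 5.5 that

$$\|\Theta'^{,1} - \Theta'^{,2}\|_{\mathbb{S}} \le C\eta\, \|\Theta^1 - \Theta^2\|_{\mathbb{S}},$$

for a possibly new value of $C$, which is enough to conclude.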
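The mechanism at work is the usual Picard iteration for a map with Lipschitz constant $C\eta < 1$. The sketch below illustrates it on a toy scalar map that merely stands in for $\Phi$; the values of `C` and `eta`, the map `phi` and the helper `picard_fixed_point` are assumptions made for illustration and are not taken from the paper.

```python
import math


def picard_fixed_point(phi, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = phi(x_n) until two successive iterates
    differ by at most tol; return the last iterate and the step count."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("Picard iteration did not converge")


# Hypothetical constants: the contraction factor is C * eta = 0.5 < 1.
C, eta = 2.0, 0.25


def phi(x):
    # Toy map whose Lipschitz constant is at most C * eta.
    return 1.0 + C * eta * math.sin(x)


x_star, n_iter = picard_fixed_point(phi, 0.0)
```

If `eta` were chosen so large that `C * eta >= 1`, the iteration would no longer be guaranteed to converge, which is why the step $\eta$ in Lemma 5.4 has to stay below some $\delta_0$.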

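5.5. Proof of Lemma 5.5. The strategy follows from a mere variation on the proof of the classical stochastic maximum principle. With the same notations as in the statement and with the classical convention for expanding $\Theta'$ as $(X'_t, \mathbb{P}_{X'_t}, Y'_t, Z'_t, \alpha'_t)_{0\le t\le T}$ and for letting $(\theta'_t = (X'_t, \mathbb{P}_{X'_t}, \alpha'_t))_{0\le t\le T}$, we then compute

$$\begin{aligned}
\mathbb{E}[(X'_T - X_T)\cdot Y_T] &= \mathbb{E}[(\xi' - \xi)\cdot Y_0] \\
&\quad - \gamma\Big\{ \mathbb{E}\int_0^T \big[\partial_x H(t,\Theta_t)\cdot(X'_t - X_t) + \tilde{\mathbb{E}}[\partial_\mu H(t,\tilde\Theta_t)(X_t)]\cdot(X'_t - X_t)\big]\,dt \\
&\qquad\quad - \mathbb{E}\int_0^T \big[[b(t,\theta'_t) - b(t,\theta_t)]\cdot Y_t + [\sigma(t,\theta'_t) - \sigma(t,\theta_t)]\cdot Z_t\big]\,dt \Big\} \\
&\quad - \Big\{ \mathbb{E}\int_0^T \big[(X'_t - X_t)\cdot \mathcal{I}^f_t + (\mathcal{I}^b_t - \mathcal{I}^{b,\prime}_t)\cdot Y_t + (\mathcal{I}^\sigma_t - \mathcal{I}^{\sigma,\prime}_t)\cdot Z_t\big]\,dt \Big\} \\
&= T_0 - \gamma T_1 - T_2.
\end{aligned}$$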
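Moreover, following (35),

$$\begin{aligned}
\mathbb{E}[(X'_T - X_T)\cdot Y_T] &= \gamma\, \mathbb{E}\big[\big(\partial_x g(X_T, \mathbb{P}_{X_T}) + \tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T, \mathbb{P}_{X_T})(X_T)]\big)\cdot(X'_T - X_T)\big] + \mathbb{E}[(X'_T - X_T)\cdot \mathcal{I}^g_T] \\
&\le \gamma\, \mathbb{E}[g(X'_T, \mathbb{P}_{X'_T}) - g(X_T, \mathbb{P}_{X_T})] + \mathbb{E}[(X'_T - X_T)\cdot \mathcal{I}^g_T].
\end{aligned}$$

Identifying the two expressions right above and then repeating the proof of Theorem 4.6, we obtain

$$\gamma J(\alpha') - \gamma J(\alpha) \ge \gamma\lambda\, \mathbb{E}\int_0^T |\alpha_t - \alpha'_t|^2\,dt + T_0 - T_2 + \mathbb{E}[(X_T - X'_T)\cdot \mathcal{I}^g_T]. \tag{47}$$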

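Now, we can reverse the roles of $\alpha$ and $\alpha'$ in (47). Denoting by $T'_0$ and $T'_2$ the corresponding terms in the inequality and then making the sum of both inequalities, we deduce that:

$$2\gamma\lambda\, \mathbb{E}\int_0^T |\alpha_t - \alpha'_t|^2\,dt + T_0 + T'_0 - (T_2 + T'_2) + \mathbb{E}[(X_T - X'_T)\cdot(\mathcal{I}^g_T - \mathcal{I}^{g,\prime}_T)] \le 0.$$

The sum $T_2 + T'_2$ reads

$$T_2 + T'_2 = \mathbb{E}\int_0^T \big[-(\mathcal{I}^f_t - \mathcal{I}^{f,\prime}_t)\cdot(X_t - X'_t) + (\mathcal{I}^b_t - \mathcal{I}^{b,\prime}_t)\cdot(Y_t - Y'_t) + (\mathcal{I}^\sigma_t - \mathcal{I}^{\sigma,\prime}_t)\cdot(Z_t - Z'_t)\big]\,dt.$$

Similarly,

$$T_0 + T'_0 = -\mathbb{E}[(\xi - \xi')\cdot(Y_0 - Y'_0)].$$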



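Therefore, using Young's inequality, there exists a constant $C$ (the value of which may increase from line to line), $C$ being independent of $\gamma$, such that, for any $\varepsilon > 0$,

$$\gamma\, \mathbb{E}\int_0^T |\alpha_t - \alpha'_t|^2\,dt \le \varepsilon \|\Theta - \Theta'\|_{\mathbb{S}}^2 + \frac{C}{\varepsilon}\big(\mathbb{E}[|\xi - \xi'|^2] + \|\mathcal{I} - \mathcal{I}'\|_{\mathbb{I}}^2\big). \tag{48}$$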



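Now, we observe by standard estimates for BSDEs that there exists a constant $C$, independent of $\gamma$, such that

$$\mathbb{E}\Big[\sup_{0\le t\le T} |Y_t - Y'_t|^2 + \int_0^T |Z_t - Z'_t|^2\,dt\Big] \le C\gamma\, \mathbb{E}\Big[\sup_{0\le t\le T} |X_t - X'_t|^2 + \int_0^T |\alpha_t - \alpha'_t|^2\,dt\Big] + C\|\mathcal{I} - \mathcal{I}'\|_{\mathbb{I}}^2. \tag{49}$$

Similarly,

$$\mathbb{E}\Big[\sup_{0\le t\le T} |X_t - X'_t|^2\Big] \le \mathbb{E}[|\xi - \xi'|^2] + C\gamma\, \mathbb{E}\int_0^T |\alpha_t - \alpha'_t|^2\,dt + C\|\mathcal{I} - \mathcal{I}'\|_{\mathbb{I}}^2. \tag{50}$$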
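From (49) and (50) and then from (48), we deduce that

$$\mathbb{E}\Big[\sup_{0\le t\le T} |X_t - X'_t|^2 + \sup_{0\le t\le T} |Y_t - Y'_t|^2 + \int_0^T |Z_t - Z'_t|^2\,dt\Big] \le C\varepsilon \|\Theta - \Theta'\|_{\mathbb{S}}^2 + \frac{C}{\varepsilon}\big(\mathbb{E}[|\xi - \xi'|^2] + \|\mathcal{I} - \mathcal{I}'\|_{\mathbb{I}}^2\big).$$

Using the Lipschitz property of $\hat\alpha(t,\cdot)$ and then choosing $\varepsilon$ small enough, we complete the proof.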

5.6. Decoupling Field. The notion of decoupling field, also referred to as 'FBSDE value function', plays a central role in the machinery of forward-backward equations, since it makes it possible to represent the value $Y_t$ of the backward process at time $t$ as a function of the value $X_t$ of the forward process at time $t$. When the coefficients of the forward-backward equation are random, the decoupling field is a random field. When the coefficients are deterministic, the decoupling field is a deterministic function, which solves some corresponding partial differential equation. Here is the structure of the decoupling field in the McKean-Vlasov framework:

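Lemma 5.6. For any $t \in [0,T]$ and any $\xi \in L^2(\Omega, \mathcal{F}_t, \mathbb{P}; \mathbb{R}^d)$, there exists a unique solution, denoted by $(X^{t,\xi}_s, Y^{t,\xi}_s, Z^{t,\xi}_s)_{t\le s\le T}$, of (42) when set on $[t,T]$ with $X^{t,\xi}_t = \xi$ as initial condition.

In this framework, for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, there exists a measurable mapping $u(t,\cdot,\mu) : \mathbb{R}^d \ni x \mapsto u(t,x,\mu)$ such that

$$\mathbb{P}\big(Y^{t,\xi}_t = u(t,\xi,\mathbb{P}_\xi)\big) = 1. \tag{52}$$

Moreover, there exists a constant $C$, only depending on the parameters in (B1-4), such that, for any $t \in [0,T]$ and any $\xi^1, \xi^2 \in L^2(\Omega, \mathcal{F}_t, \mathbb{P}; \mathbb{R}^d)$,

$$\mathbb{E}\big[|u(t,\xi^1,\mathbb{P}_{\xi^1}) - u(t,\xi^2,\mathbb{P}_{\xi^2})|^2\big] \le C\, \mathbb{E}[|\xi^1 - \xi^2|^2]. \tag{53}$$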

The proof is given right below. For the moment, we notice that the additional variable $\mathbb{P}_\xi$ comes for free in the above writing since we could set $v(t,\cdot) = u(t,\cdot,\mathbb{P}_\xi)$ and then have $Y^{t,\xi}_t = v(t,\xi)$. The additional variable $\mathbb{P}_\xi$ is specified to emphasize the non-Markovian nature of the equation over the state space $\mathbb{R}^d$: starting from two different initial conditions, the decoupling fields might not be the same, since the laws of the initial conditions might be different. Keep indeed in mind that, in the Markovian framework, the decoupling field is the same for all possible initial conditions, thus yielding the connection with partial differential equations. Here the Markov property holds, but over the enlarged space $\mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d)$, thus justifying the use of the extra variable $\mathbb{P}_\xi$. Nevertheless, we often omit the dependence upon $\mathbb{P}_\xi$ in the sequel of the paper.

An important fact is that the representation formula (52) can be extended to the whole path:

Proposition 5.7. Under (B1-4), for any $\xi \in L^2(\Omega, \mathcal{F}_0, \mathbb{P}; \mathbb{R}^d)$, there exists a measurable mapping $v : [0,T] \times \mathbb{R}^d \to \mathbb{R}^d$ such that

$$\mathbb{P}\big(\forall t \in [0,T],\ Y^{0,\xi}_t = v(t, X^{0,\xi}_t)\big) = 1.$$

It satisfies $\sup_{0\le t\le T} |v(t,0)| < +\infty$. Moreover, there exists a constant $C$ such that $v(t,\cdot)$ is $C$-Lipschitz continuous for any $t \in [0,T]$.
We start with

Proof of Lemma 5.6. Given $t \in [0,T)$ and $\xi \in L^2(\Omega, \mathcal{F}_t, \mathbb{P}; \mathbb{R}^d)$, existence and uniqueness to (42) when set on $[t,T]$ with $\xi$ as initial condition is a direct consequence of Theorem 5.1 (or, more precisely, of the proof of it since we are handling a random initial condition). Using as underlying filtration the augmented filtration $\mathbb{F}^t$ generated by $\xi$ and by $(W_s - W_t)_{t\le s\le T}$, we deduce that $Y^{t,\xi}_t$ coincides a.s. with a $\sigma(\xi)$-measurable $\mathbb{R}^d$-valued random variable. In particular, there exists a measurable function $u_\xi(t,\cdot) : \mathbb{R}^d \to \mathbb{R}^d$ such that $\mathbb{P}(Y^{t,\xi}_t = u_\xi(t,\xi)) = 1$.

We now claim that the law of $(\xi, Y^{t,\xi}_t)$ only depends upon the law of $\xi$. This directly follows from the version of the Yamada-Watanabe theorem for FBSDEs, see [?]. Since uniqueness holds pathwise, it also holds in law so that, given two initial conditions with the same law, the solutions also have the same laws. Therefore, given another $\mathbb{R}^d$-valued random vector $\xi'$ with the same law as $\xi$, it holds that $(\xi, u_\xi(t,\xi)) \sim (\xi', u_{\xi'}(t,\xi'))$. In particular, for any measurable function $v : \mathbb{R}^d \to \mathbb{R}^d$, the random variables $u_\xi(t,\xi) - v(\xi)$ and $u_{\xi'}(t,\xi') - v(\xi')$ have the same law. Choosing $v = u_\xi(t,\cdot)$, we deduce that $u_{\xi'}(t,\cdot)$ and $u_\xi(t,\cdot)$ are a.e. equal under the probability measure $\mathbb{P}_\xi$. Put differently, denoting by $\mu$ the law of $\xi$, there exists an element $u(t,\cdot,\mu) \in L^2(\mathbb{R}^d, \mu)$ such that $u_\xi(t,\cdot)$ and $u_{\xi'}(t,\cdot)$ coincide $\mu$-a.e. with $u(t,\cdot,\mu)$. Identifying $u(t,\cdot,\mu)$ with one of its versions, this proves that

$$\mathbb{P}\big(Y^{t,\xi}_t = u(t,\xi,\mu)\big) = 1.$$



When $t > 0$, we notice that, for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, there exists an $\mathcal{F}_t$-measurable random variable $\xi$ such that $\mu = \mathbb{P}_\xi$. In such a case, the procedure we just described makes it possible to define $u(t,\cdot,\mu)$ for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$. The situation may be different when $t = 0$ as $\mathcal{F}_0$ may reduce to events of measure zero or one. In such a case, $\mathcal{F}_0$ can be enlarged without any loss of generality in order to support $\mathbb{R}^d$-valued random variables of any arbitrarily prescribed distribution.

The Lipschitz property (53) of $u(0,\cdot,\cdot)$ is a direct consequence of Lemma 5.5 with $\gamma = 1$. By a time shift, the same argument applies to $u(t,\cdot,\cdot)$. □



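We now turn to

Proof of Proposition 5.7. For simplicity, we just denote $(X^{0,\xi}_t, Y^{0,\xi}_t, Z^{0,\xi}_t)_{0\le t\le T}$ by $(X_t, Y_t, Z_t)_{0\le t\le T}$. The proof is then a combination of Lemmas 3.2 and 5.6. Indeed, given $t \in (0,T]$, Lemma 5.6 says that the family $(u(t,\cdot,\mu))_{\mu \in \mathcal{P}_2(\mathbb{R}^d)}$ satisfies (8) since any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ can be seen as the law of some $\mathcal{F}_t$-measurable random vector $\xi$. Therefore, for $\mu = \mathbb{P}_{X_t}$, we can find a mapping $w(t,\cdot)$ that is $C$-Lipschitz continuous (for the same $C$ as in (53)) and that coincides with $u(t,\cdot,\mathbb{P}_{X_t})$ a.e. under the probability measure $\mathbb{P}_{X_t}$. It satisfies

$$\forall t \in [0,T], \quad \mathbb{P}\big(Y_t = w(t,X_t)\big) = 1, \tag{54}$$

since $Y_t = Y^{t,X_t}_t$. In particular,

$$\begin{aligned}
\sup_{0\le t\le T} |w(t,0)| &\le \sup_{0\le t\le T} \mathbb{E}[|Y_t|] + \sup_{0\le t\le T} \mathbb{E}[|w(t,X_t) - w(t,0)|] \\
&\le \sup_{0\le t\le T} \mathbb{E}[|Y_t|] + C \sup_{0\le t\le T} \mathbb{E}[|X_t|] < +\infty.
\end{aligned} \tag{55}$$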
0<t<T 0<t<T 
For any integer $n \ge 1$, we then let

$$v^n(t,x) = 1_{[0,T/2^n]}(t)\, w\Big(\frac{T}{2^n}, x\Big) + \sum_{k=2}^{2^n} 1_{((k-1)T/2^n,\, kT/2^n]}(t)\, w\Big(\frac{kT}{2^n}, x\Big), \quad t \in [0,T],\ x \in \mathbb{R}^d.$$

Denoting by $v^{n,i}$ the $i$th coordinate of $v^n$ for any $i \in \{1,\dots,d\}$, we also let

$$v^i(t,x) = \limsup_{n\to+\infty} v^{n,i}(t,x), \quad t \in [0,T],\ x \in \mathbb{R}^d,$$

and then $v(t,x) = (v^1(t,x), \dots, v^d(t,x))$. As each of the $v^n$ is a Borel measurable function on $[0,T] \times \mathbb{R}^d$, so is $v$. Similarly, $v$ satisfies (55) and, for any $t \in [0,T]$, $v(t,\cdot)$ is $C$-Lipschitz continuous. Finally, we notice that, for any $t \in D_n = \{kT/2^n,\ k \in \{1,\dots,2^n\}\}$, with $n \in \mathbb{N}\setminus\{0\}$, and for any $\ell \ge n$,

$$v^\ell(t,\cdot) = w(t,\cdot) = v(t,\cdot),$$

so that $w(t,\cdot) = v(t,\cdot)$ for any $t \in D = \cup_{n\ge 1} D_n$. Therefore, $D$ being countable, we deduce from (54) that the event

$$A = \{\omega \in \Omega : \forall t \in D,\ Y_t(\omega) = w(t,X_t(\omega)) = v(t,X_t(\omega))\}$$

has measure $\mathbb{P}(A) = 1$. On the event $A$, we notice that, for any $t \in (0,T]$,

$$Y_t = \lim_{n\to+\infty} Y_{t_n} = \lim_{n\to+\infty} v(t_n, X_{t_n}),$$

where $(t_n)_{n\ge 1}$ is the sequence of points in $D$ such that, for any $n \ge 1$, $t_n \in D_n$ and $t_n - T/2^n < t \le t_n$. Since $v(t_n,\cdot)$ is $C$-Lipschitz continuous, we deduce that, on $A$,

$$Y_t = \lim_{n\to+\infty} v(t_n, X_t),$$

which is to say that the sequence $(v(t_n,X_t))_{n\ge 1}$ is convergent. Now we observe that $v(t_n,X_t)$ is also $v^n(t,X_t)$. Therefore, the limit must coincide with $v(t,X_t)$. This proves that, on the event $A$, $Y_t = v(t,X_t)$ for any $t \in (0,T]$. By the same argument, the same holds true at $t = 0$ ($0$ being handled apart for questions of notation since the definition of $v^n$ at time $0$ is rather specific). □
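The dyadic points $t_n$ used in this proof are easy to compute explicitly. The following sketch, with the illustrative helper name `dyadic_upper` and exact rational arithmetic, returns the point $t_n \in D_n$ satisfying $t_n - T/2^n < t \le t_n$; the sample values of $t$ and $T$ are arbitrary.

```python
from fractions import Fraction
import math


def dyadic_upper(t, T, n):
    """Return the point t_n = k*T/2**n of D_n such that
    t_n - T/2**n < t <= t_n, for 0 < t <= T."""
    t, T = Fraction(t), Fraction(T)
    if not (0 < t <= T):
        raise ValueError("t must lie in (0, T]")
    step = T / 2**n
    k = math.ceil(t / step)   # smallest k with k*step >= t
    return k * step


# Arbitrary sample: t = 3/10, T = 1, levels n = 1, ..., 5.
points = [dyadic_upper(Fraction(3, 10), 1, n) for n in range(1, 6)]
```

The resulting sequence is nonincreasing and stays above $t$, so it converges to $t$ from the right, which is the property that allows path continuity to be used in the limit $Y_t = \lim_n Y_{t_n}$.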



6. Propagation of Chaos and Approximate Equilibrium 

In this section, we show how the solution of the optimal control of stochastic dynamics of the McKean-Vlasov type can be used to handle $N$-player games when $N$ tends to $+\infty$.

Throughout this section, assumptions (B1-4) are in force. For each integer $N \ge 1$, we consider a stochastic system whose time evolution is given by a system of $N$ coupled stochastic differential equations of the form

$$dU^i_t = b(t, U^i_t, \bar\nu^N_t, \beta^i_t)\,dt + \sigma(t, U^i_t, \bar\nu^N_t, \beta^i_t)\,dW^i_t, \quad 1 \le i \le N; \qquad \bar\nu^N_t = \frac{1}{N}\sum_{j=1}^N \delta_{U^j_t}, \tag{56}$$

with $t \in [0,T]$ and $U^i_0 = x_0$, $1 \le i \le N$. Here $((\beta^i_t)_{0\le t\le T})_{1\le i\le N}$ are $N$ $\mathbb{R}^k$-valued processes that are progressively measurable with respect to the filtration generated by $(W^1, \dots, W^N)$ and have finite $L^2$ norms over $[0,T] \times \Omega$:

$$\forall i \in \{1,\dots,N\}, \quad \mathbb{E}\int_0^T |\beta^i_t|^2\,dt < +\infty,$$

where, for convenience, we have fixed an infinite sequence $((W^j_t)_{0\le t\le T})_{j\ge 1}$ of independent $m$-dimensional Brownian motions. One should think of $U^i_t$ as the (private) state at time $t$ of agent or player $i \in \{1,\dots,N\}$, $\beta^i_t$ being the action taken at time $t$ by player $i$. For each $1 \le i \le N$, we denote by

$$J^{N,i}(\beta^1,\dots,\beta^N) = \mathbb{E}\Big[ g(U^i_T, \bar\nu^N_T) + \int_0^T f(t, U^i_t, \bar\nu^N_t, \beta^i_t)\,dt \Big] \tag{57}$$

the cost to the $i$th player. We then recover the same set-up as in the case of the mean field game models studied in [7]. However, the rule we apply for minimizing the cost is a bit different. The point is indeed to minimize the cost over exchangeable strategies: when the strategy $\beta = (\beta^1, \dots, \beta^N)$ is exchangeable, the costs to all the players are the same and thus read as a common cost $J^{N,i}(\beta) = J^N(\beta)$. From a practical point of view, restricting the minimization to exchangeable strategies means that the players are intended to obey a common policy, which is not the case in the standard mean field game approach.

In this framework, one of our goals is to compute the limit

$$\lim_{N\to+\infty} \inf_\beta J^N(\beta),$$

the infimum being taken over exchangeable strategies. Another one is to identify, for each integer $N$, a specific set of $\varepsilon$-optimal strategies and the corresponding state evolutions.

6.1. Limit of the Costs and Non-Markovian Approximate Equilibriums. Recall that we denote by $J$ the optimal cost:

$$J = \mathbb{E}\Big[ g(X_T, \mu_T) + \int_0^T f\big(t, X_t, \mu_t, \hat\alpha(t, X_t, \mu_t, Y_t, Z_t)\big)\,dt \Big], \tag{58}$$

where $(X_t, Y_t, Z_t)_{0\le t\le T}$ is the solution to (42) with $X_0 = x_0$ as initial condition, $(\mu_t)_{0\le t\le T}$ denoting the flow of marginal probability measures $\mu_t = \mathbb{P}_{X_t}$, for $0 \le t \le T$.

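For the purpose of comparison, we introduce $(\underline{X}^1, \dots, \underline{X}^N)$, each $\underline{X}^i$ standing for the solution of the forward equation in (42) when driven by the Brownian motion $W^i$. Put differently, $(\underline{X}^1, \dots, \underline{X}^N)$ solves the system (56) when the empirical distribution $\bar\nu^N_t$ is replaced by $\mu_t$ and $\beta^i_t$ is given by $\beta^i_t = \underline{\hat\alpha}^i_t$ with

$$\underline{\hat\alpha}^i_t = \hat\alpha(t, \underline{X}^i_t, \mu_t, \underline{Y}^i_t, \underline{Z}^i_t),$$

the pair $(\underline{Y}^i, \underline{Z}^i)$ solving the backward equation in (42) when driven by $W^i$. Note that the processes $((\underline{\Theta}^i_t = (\underline{X}^i_t, \mu_t, \underline{Y}^i_t, \underline{Z}^i_t, \underline{\hat\alpha}^i_t))_{0\le t\le T})_{1\le i\le N}$ are independent.
Here is the first result: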



Theorem 6.1. Under assumptions (B1-4),

$$\lim_{N\to+\infty} \inf_\beta J^N(\beta) = J,$$

the infimum being taken over exchangeable (square integrable) strategies $\beta = (\beta^1, \dots, \beta^N)$. Moreover, the non-Markovian control $\underline{\hat\alpha} = (\underline{\hat\alpha}^1, \dots, \underline{\hat\alpha}^N)$ is an approximate optimal control in the sense that

$$\lim_{N\to+\infty} J^N(\underline{\hat\alpha}) = J.$$

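Proof. The proof consists in comparing $J^N(\beta)$ to $J$ for a given exchangeable strategy $\beta$. Once again, it relies on a variant of the stochastic maximum principle exposed in Section 4. With the above notation, we indeed obtain

$$J^N(\beta) - J = \mathbb{E}\big[g(U^i_T, \bar\nu^N_T) - g(\underline{X}^i_T, \mu_T)\big] + \mathbb{E}\int_0^T \big[f(s, U^i_s, \bar\nu^N_s, \beta^i_s) - f(s, \underline{X}^i_s, \mu_s, \underline{\hat\alpha}^i_s)\big]\,ds,$$

the identity holding true for any $1 \le i \le N$. Therefore, we can write

$$J^N(\beta) - J = T^i_1 + T^i_2 \tag{59}$$

with

$$\begin{aligned}
T^i_1 &= \mathbb{E}\big[(U^i_T - \underline{X}^i_T)\cdot \underline{Y}^i_T\big] + \mathbb{E}\int_0^T \big[f(s, U^i_s, \bar\nu^N_s, \beta^i_s) - f(s, \underline{X}^i_s, \mu_s, \underline{\hat\alpha}^i_s)\big]\,ds, \\
T^i_2 &= \mathbb{E}\big[g(U^i_T, \bar\nu^N_T) - g(\underline{X}^i_T, \mu_T)\big] - \mathbb{E}\big[(U^i_T - \underline{X}^i_T)\cdot \partial_x g(\underline{X}^i_T, \mu_T)\big] - \mathbb{E}\tilde{\mathbb{E}}\big[(\tilde U^i_T - \tilde{\underline{X}}^i_T)\cdot \partial_\mu g(\underline{X}^i_T, \mu_T)(\tilde{\underline{X}}^i_T)\big] \\
&= T^i_{2,1} - T^i_{2,2} - T^i_{2,3},
\end{aligned}$$

where we used Fubini's theorem with the independent copies denoted with a tilde '~'.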
Analysis of $T^i_2$. Using the diffusive effect of independence, we claim



rpl 

-'2,3 



1 

N 



N 



+ 0(E[|i/^-l^|2]^/^E 



N 



N 



N 



d^gixi, lST){Xir) -E[d^g{X'T, ^it){X't) 



1/2 



^5;E[([/f-X^).a^5(^^,MT)(^^)]+E[|C/^-X^|2]i/'0(iV-i/2) 



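where $O(\cdot)$ stands for the Landau notation. Therefore, taking advantage of the exchangeability in order to handle the remainder, we obtain

$$\frac{1}{N}\sum_{i=1}^N T^i_{2,3} = \frac{1}{N^2}\sum_{j=1}^N \sum_{i=1}^N \mathbb{E}\big[(U^j_T - \underline{X}^j_T)\cdot \partial_\mu g(\underline{X}^i_T, \mu_T)(\underline{X}^j_T)\big] + \mathbb{E}\big[|U^1_T - \underline{X}^1_T|^2\big]^{1/2} O(N^{-1/2}).$$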
Introducing a random variable $\vartheta$ from $(\Omega, \mathcal{F}, \mathbb{P})$ into $\mathbb{R}$ with uniform distribution on $\{1, \dots, N\}$, as done in the proof of Proposition 3.1, we can write



N N 



1=1 



Finally, defining the flow of empirical measures 

N 



and using (B3), Propositions 
the above estimate gives: 



3.1 



and 



N 



3.3 



to estimate the distance VF2(/U^, /^t), 



and also Remark 



3.4 



N 



N 



i=i 



where we used the notation $\ell_N(d)$ for any function of $N$ which could be used as an upper bound for:



(60) 



,1/2 



+ 



1/2 



0{lN{d)). 



By Remark 3.4, the left-hand side tends to $0$ as $N$ tends to $+\infty$, since the function $[0,T] \ni t \mapsto \mathbb{E}[W_2^2(\bar\mu^N_t, \mu_t)]$ can be bounded independently of $N$. Therefore, $(\ell_N(d))_{N\ge 1}$ is always chosen as a sequence that converges to $0$ as $N$ tends to $+\infty$. When $\sup_{0\le t\le T} |X_t|$ has finite moment of order $d + 5$, Remark 3.4 says that $\ell_N(d)$ can be chosen as $N^{-1/(d+4)}$. In any case, we will assume that

£N{d)> A^-^/^oing back to i 



N N . 

i=l 1=1 



where we used local Lipschitz property of g and Remark 3.4 to replace by 



.AT 



Noting that a.s. under P, the law of (resp. X^) under P is the empirical distribution (resp 



/i^), we can apply the convexity property of g, see ( [T3| ), to get 



$$\frac{1}{N}\sum_{i=1}^N T^i_2 \ge -\big(1 + \mathbb{E}[|U^1_T - \underline{X}^1_T|^2]^{1/2}\big)\, O(\ell_N(d)). \tag{61}$$
Analysis of $T^i_1$. Using Itô's formula and Fubini's theorem, we obtain



Tl = E 



(62) 



E 



[H{s, Ul i>f , y;, Zl, PI) - H{s, Xl,fis,Y:, Zl, al))ds 

T _ 

{Ui - Xi) ■ d,H{s, Xi, Y:, zl a\)ds 

'^{ui - 11) ■ d^H{s, xj, /X,, y;, Zlal){ll)ds 



- EE 



rj-ii rpi rpi 



Using the local Lipschitz property of the Hamiltonian and (60) and recalling that the limit process $(\underline{X}^i_t, \mu_t, \underline{Y}^i_t, \underline{Z}^i_t, \underline{\hat\alpha}^i_t)_{0\le t\le T}$ has finite $\mathbb{S}$-norm (see (43)), we get:



T{ 1 = E 







H{s,Ui,u^,Y,\Zl,Pi)-H{s,Xl,flf,Y:,Zl,al))ds 



+ 0{iNid)). 



Similarly, by exchangeability: 

rT 



1,2 



E 



{Ul-Xl)-d,H{s,Xl,llf,Y:,Zl,al)ds 







-T _ \ 1/2 

+ (E/ \Ui-Xi\'^dsj 0{iN{d)). 







Finally, using the diffusive effect of independence, we have 



N 



N 



•I 

1,3 



1 



N 



1=1 







(63) 



N ^ 

^ N N 



T 



j=l i=l 



{ui - xi) . d^H{s, xl,^is, y:, Zl,^l){Xl)ds 

iU:-Xl)-d^H{s,Xi,f,s,Yi,Zi,ai){Xl)ds 

\ 1/2 



f-T _ \ 1/2 

Ej \U}-Xl\'^dsj 0{N-^/^). 



By (B3), Propositions 3.1 and 3.3 we have 

rT 



^E^M = ^E™ / {Uf-Xt)-d,H{s,Xlf,sX\Zl,al){Xf)ds 
i=i i=i L^o 

/ rT _ Nl/2 

+ [^J \U}-Xl\'^dsj 0{N-^/^) 



1 ^ 



N 



i=l 
+ 



- Xt) . d,H{s,Xl,fl^:,Y:,Zl,al)iXt)ds 



E / \U}-Xl\'^ds^ ' 0{£N{d)). 
In order to complete the proof, we evaluate the missing term in the Taylor expansion of $T^i_1$ in (62), namely

' rT 

''i 7-.N ryi -i 



1 ^ r rT 
j=l L^U 



in order to benefit from the convexity of $H$. We use Remark 3.4 once more:



E 







(64) = E 



{(31 - al) ■ d^H{s, XI, , Y:, Zl, a\)ds 



- a\) ■ d^H{s, Xi, fis, Zl a\)ds 



T \ 1/2 

i|2. 



■T X 1/2 

+ ( E / aW'^dsj 0{tN{d)) 



E j \(3l-a\\'dsj 0{iN{d)), 

since $\hat\alpha$ is an optimizer for $H$. Using the convexity of $H$ and taking advantage of the exchangeability, we finally deduce from (62), (63) and (64) that there exists a constant $c > 0$ such that

N 



i=i •'^ 

-0{iN{d))(l+ sup ^\U}-X}\^] +E /' 1/3, 



T \ 1/2 

1 alPd^. 



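By (61) and (59), we deduce that

$$J^N(\beta) \ge J + c\, \mathbb{E}\int_0^T |\beta^1_s - \underline{\hat\alpha}^1_s|^2\,ds - O(\ell_N(d))\Big(1 + \sup_{0\le t\le T} \mathbb{E}\big[|U^1_t - \underline{X}^1_t|^2\big] + \mathbb{E}\int_0^T |\beta^1_s - \underline{\hat\alpha}^1_s|^2\,ds\Big)^{1/2}.$$

From the inequality

$$\sup_{0\le t\le T} \mathbb{E}\big[|U^1_t - \underline{X}^1_t|^2\big] \le C\, \mathbb{E}\int_0^T |\beta^1_s - \underline{\hat\alpha}^1_s|^2\,ds,$$

which holds for some constant $C$ independent of $N$, we deduce that

$$J^N(\beta) \ge J - C\ell_N(d), \tag{65}$$

for a possibly new value of $C$, which proves that

$$\liminf_{N\to+\infty} \inf_\beta J^N(\beta) \ge J.$$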

In order to prove Theorem 6.1, it thus remains to find a sequence of controls $(\beta^N)_{N\ge 1}$ such that

$$\limsup_{N\to+\infty} J^N(\beta^N) \le J.$$

Precisely, we show right below that

$$\limsup_{N\to+\infty} J^N(\underline{\hat\alpha}) \le J,$$

thus proving that $\underline{\hat\alpha} = (\underline{\hat\alpha}^1, \dots, \underline{\hat\alpha}^N)$ is an approximate equilibrium, but of non-Markovian type. Denoting by $(X^1, \dots, X^N)$ the solution of (56) with $\beta^i_t = \underline{\hat\alpha}^i_t$, classical estimates from the theory of propagation of chaos imply (see e.g. [16] or [10]) that

$$\sup_{0\le t\le T} \mathbb{E}\big[|X^i_t - \underline{X}^i_t|^2\big] = \sup_{0\le t\le T} \mathbb{E}\big[|X^1_t - \underline{X}^1_t|^2\big] = O(N^{-1}).$$

It is then plain to deduce that

$$\limsup_{N\to+\infty} J^N(\underline{\hat\alpha}) \le J.$$

This completes the proof. □
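The $O(N^{-1})$ propagation of chaos estimate used at the end of the proof can be observed numerically on a toy model. The sketch below is not the controlled system of the paper: it assumes a scalar, uncontrolled dynamics $dX^i_t = -(X^i_t - m_t)\,dt + dW^i_t$ started from $0$, where $m_t$ is the empirical mean for the interacting particles and the true mean $0$ for the independent limit copies, discretized by an Euler scheme. The function name `chaos_error` and all numerical parameters are illustrative assumptions.

```python
import random


def chaos_error(n_particles, n_steps=20, horizon=1.0, n_runs=100, seed=0):
    """Monte Carlo estimate of E|X^{N,1}_T - X^1_T|^2, where X^{N,i}
    interacts through the empirical mean and X^i is the limit copy
    driven by the same Brownian increments."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = dt ** 0.5
    total = 0.0
    for _ in range(n_runs):
        coupled = [0.0] * n_particles   # interacting particle system
        limit = [0.0] * n_particles     # independent copies of the limit
        for _ in range(n_steps):
            mean = sum(coupled) / n_particles
            for i in range(n_particles):
                dw = rng.gauss(0.0, sqrt_dt)  # shared noise for both systems
                coupled[i] += -(coupled[i] - mean) * dt + dw
                limit[i] += -limit[i] * dt + dw
        total += (coupled[0] - limit[0]) ** 2
    return total / n_runs


err_small = chaos_error(5)     # N = 5 particles
err_large = chaos_error(200)   # N = 200 particles
```

In this toy model the squared error is exactly proportional to $1/N$ in expectation, so `err_large` should be markedly smaller than `err_small`, up to Monte Carlo fluctuations.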

6.2. Approximate Equilibriums with Distributed Closed Loop Controls. When $\sigma$ doesn't depend upon $\alpha$, we are able to provide an approximate equilibrium using only distributed controls in closed loop form. This is of real interest from the practical point of view. Indeed, in such a case, the optimizer $\hat\alpha$ of the Hamiltonian, as defined in (39), doesn't depend on $z$. It thus reads as $\hat\alpha(t, x, \mu, y)$. By Proposition 5.7, this says that the optimal control $(\hat\alpha_t)_{0\le t\le T}$ in Theorem 5.1 has the feedback form:

$$\hat\alpha_t = \hat\alpha(t, X_t, \mathbb{P}_{X_t}, v(t, X_t)), \quad t \in [0,T]. \tag{66}$$

The reader might object that, also in the case when $\sigma$ depends upon $\alpha$, the process $Z_t$ at time $t$ is also expected to read as a function of $X_t$, since such a representation is known to hold in the classical decoupled forward-backward setting. Even if we feel that it is indeed possible to prove such a representation in our more general setting, we must address the following points: (i) From a practical point of view, (66) is meaningful if the feedback function is Lipschitz-continuous, as the Lipschitz property ensures that the stochastic differential equation obtained by plugging (66) into the forward equation in (42) is solvable; (ii) In the current framework, the function $v$ is known to be Lipschitz continuous by Proposition 5.7, but proving the same result for the representation of $Z_t$ in terms of $X_t$ seems to be really challenging (by the way, it is already challenging in the standard case, i.e. without any McKean-Vlasov interaction); (iii) We finally mention that, in any case, the relationship between $Z_t$ and $X_t$, if it exists, must be rather intricate as $Z_t$ is expected to solve the equation $Z_t = \partial_x v(t, X_t)\, \sigma(t, X_t, \mathbb{P}_{X_t}, \hat\alpha(t, X_t, \mathbb{P}_{X_t}, Y_t, Z_t))$, which can be formally derived by identifying martingale integrands when expanding $Y_t = v(t, X_t)$ by a formal application of Itô's formula. This equation has been investigated in [18] in the standard case, but we find it more convenient not to repeat this analysis in the current setting in order to keep things at a reasonable level of complexity.

Now, for each integer $N$, we can consider the solution $(X^1_t, \dots, X^N_t)_{0\le t\le T}$ of the system of $N$ stochastic differential equations

$$dX^i_t = b\big(t, X^i_t, \bar\mu^N_t, \hat\alpha(t, X^i_t, \mu_t, v(t, X^i_t))\big)\,dt + \sigma(t, X^i_t, \bar\mu^N_t)\,dW^i_t, \qquad \bar\mu^N_t = \frac{1}{N}\sum_{j=1}^N \delta_{X^j_t}, \tag{67}$$

with $t \in [0,T]$ and $X^i_0 = x_0$. The system (67) is well posed since $v$ satisfies Proposition 5.7 and the minimizer $\hat\alpha(t, x, \mu_t, y)$ is Lipschitz continuous and at most of linear growth in the variables $x$, $\mu$ and $y$, uniformly in $t \in [0,T]$. The processes $(X^i)_{1\le i\le N}$ give the dynamics of the private states of the players in the stochastic differential game of interest when the players use the strategies

$$\alpha^{N,i}_t = \hat\alpha(t, X^i_t, \mu_t, v(t, X^i_t)), \quad 0 \le t \le T,\ i \in \{1, \dots, N\}. \tag{68}$$






These strategies are in closed loop form. They are even distributed since, at each time $t \in [0,T]$, a player only needs to know his own private state in order to compute the value of the action to take at that time. By the linear growth of $v$ and of the minimizer $\hat\alpha$, it holds, for any $p \ge 2$,

$$\sup_{N\ge 1} \max_{1\le i\le N} \mathbb{E}\Big[\sup_{0\le t\le T} |X^i_t|^p\Big] < +\infty, \tag{69}$$

the expectation inside the brackets being actually independent of $i$ since the strategy is obviously exchangeable.

We then have the approximate equilibrium property: 

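Theorem 6.2. In addition to assumptions (B1-4), assume that $\sigma$ doesn't depend upon $\alpha$. Then,

$$J^N(\beta) \ge J^N(\alpha^N) - O(N^{-1/(d+4)}),$$

for any exchangeable $\beta = (\beta^1, \dots, \beta^N)$, where $\alpha^N = (\alpha^{N,1}, \dots, \alpha^{N,N})$ is defined in (68).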


Proof. We use the same notations as in the proof of Theorem 6.1. Since $\underline{\hat\alpha}^1_t$ now reads as $\hat\alpha(t, \underline{X}^1_t, \mu_t, v(t, \underline{X}^1_t))$ for $0 \le t \le T$, we first notice, by the growth property of $v$, that $\mathbb{E}[\sup_{0\le t\le T} |\underline{X}^1_t|^p] < +\infty$ for any $p \ge 1$. As mentioned in (ii) in Remark 3.4, this says that $\ell_N(d)$ in the lower bound

$$J^N(\beta) \ge J - C\ell_N(d),$$

see (65), can be chosen as $N^{-1/(d+4)}$.

Moreover, since $v(t,\cdot)$ is Lipschitz continuous, using once again classical estimates from the theory of propagation of chaos (see e.g. [16] or [10]), we also have

$$\sup_{0\le t\le T} \mathbb{E}\big[|X^i_t - \underline{X}^i_t|^2\big] = \sup_{0\le t\le T} \mathbb{E}\big[|X^1_t - \underline{X}^1_t|^2\big] = O(N^{-1}),$$

so that

$$\sup_{0\le t\le T} \mathbb{E}\big[|\alpha^{N,i}_t - \underline{\hat\alpha}^i_t|^2\big] = \sup_{0\le t\le T} \mathbb{E}\big[|\alpha^{N,1}_t - \underline{\hat\alpha}^1_t|^2\big] = O(N^{-1}),$$

for any $1 \le i \le N$. It is then plain to deduce that

$$J^N(\alpha^N) \le J + C\ell_N(d).$$

This completes the proof. □